We made a thing!

As part of a second-quarter computer science course, Alden, Anna, and I built a little app that we’re pretty proud of! The development process taught us a lot about working with data, databases, and integrating Django with front-end tools. We also functioned as a fantastic team, with Anna leading the front end, Alden programming a priorities-based ranking algorithm, and me working on the data, database, and Django integration.

Data was one of the project’s main challenges. Data integration aside, we were not able to find data on more than 100 US cities. We also would’ve liked to build this application at the zip code level, but that was infeasible for a ten-week project.

But we now have a pet project that we can improve by integrating new tools as we learn them. Some of the things we hope to include someday:

  • more data granularity
  • improved algorithm
  • multi-user preferences (for partners)
  • distance to mom feature
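
For a sense of what a priorities-based ranking can look like, here is a minimal sketch. It is not Alden's actual algorithm: the city names, the feature scores, and the `rank_cities` function are all made up for illustration, and it assumes each feature is already scaled to a 0-1 score and that the user gives each priority a numeric weight.

```python
def rank_cities(cities, priorities):
    """Return city names sorted by weighted score, best first.

    cities: dict mapping city name -> dict of feature scores in [0, 1]
    priorities: dict mapping feature name -> non-negative weight
    """
    total_weight = sum(priorities.values())
    if total_weight == 0:
        raise ValueError("at least one priority needs a positive weight")

    def score(features):
        # Weighted average of the features; a missing feature counts as 0.
        weighted = sum(w * features.get(name, 0.0) for name, w in priorities.items())
        return weighted / total_weight

    return sorted(cities, key=lambda city: score(cities[city]), reverse=True)


# Hypothetical scores, invented purely for illustration.
example_cities = {
    "City A": {"affordability": 0.9, "weather": 0.3, "transit": 0.4},
    "City B": {"affordability": 0.4, "weather": 0.8, "transit": 0.9},
    "City C": {"affordability": 0.6, "weather": 0.6, "transit": 0.7},
}

# Someone who cares mostly about affordability.
budget_ranking = rank_cities(
    example_cities, {"affordability": 3, "weather": 1, "transit": 1}
)
# Someone who cares mostly about weather.
weather_ranking = rank_cities(
    example_cities, {"affordability": 1, "weather": 3, "transit": 1}
)
print(budget_ranking)
print(weather_ranking)
```

The same data produces a different order once the weights change, which is the whole point of ranking by priorities. A multi-user version could, for example, average two people's weights before ranking, though that is only one of several reasonable choices.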

I am certainly a bit spooked with three midterms next week. The first five weeks of graduate school at the University of Chicago Harris School have been a test of perseverance. While the things I have learned go far beyond the professional realm, here are some things I have learned/done so far and some thoughts on future endeavors.

What I’ve learned:

  • Probability theory, including understanding and computing conditional probabilities and set theory
  • Various discrete and continuous distributions and their uses. I can now consider some data and think to myself: “this follows an exponential distribution!”
  • Object-oriented programming theory I had not worked with before, including more work with classes in Python, functional programming, and abstraction (one of the hardest tasks for me)
  • Continued using R and SQL on a research project. I specifically learned important concepts for data storage and memory usage
  • Perseverance when faced with challenging tasks. I’ve spent hours coding, debugging, and composing statistics proofs. My problem-solving perseverance threshold is certainly much higher!

What I want to do now:

  • I started a little pet data project on social media discussions of various controversial health issues (think GMOs, organic food, vaccines, etc.) that has gone nowhere. I’ve toyed with Twitter’s API and set up Postgres, but there is so much work to do.
  • I’ve had a Fitbit since March, and I think it may soon be time to download the data and become a Quantified Self follower.
  • I’ve been itching to create some form of algorithm that matches mentors to a team of mentees. I got this idea through reading about the National Resident Matching Program and after hearing about the Mentorship program through the Harris School.
  • Doing some research on technology policy, including privacy, technology education, ethics, technology in use in the future, etc.
  • Start considering data science internships for Summer 2016.
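
The mentor-matching idea above could start from deferred acceptance, the family of algorithms behind the National Resident Matching Program. What follows is only a rough sketch under my own assumptions: mentees propose, each mentor has a fixed team size, and everyone supplies a ranked preference list. The names, preference lists, and the `match_mentees` function are hypothetical.

```python
def match_mentees(mentee_prefs, mentor_prefs, capacity):
    """Mentee-proposing deferred acceptance with mentor capacities.

    mentee_prefs: dict mentee -> list of mentors, most preferred first
    mentor_prefs: dict mentor -> list of mentees, most preferred first
    capacity: dict mentor -> maximum team size
    Returns dict mentor -> list of matched mentees.
    """
    # Lower rank number means the mentor prefers that mentee more.
    rank = {
        mentor: {mentee: i for i, mentee in enumerate(prefs)}
        for mentor, prefs in mentor_prefs.items()
    }
    teams = {mentor: [] for mentor in mentor_prefs}
    next_choice = {mentee: 0 for mentee in mentee_prefs}
    unmatched = list(mentee_prefs)

    while unmatched:
        mentee = unmatched.pop()
        prefs = mentee_prefs[mentee]
        if next_choice[mentee] >= len(prefs):
            continue  # no mentors left to ask; this mentee stays unmatched
        mentor = prefs[next_choice[mentee]]
        next_choice[mentee] += 1

        if mentor not in rank or mentee not in rank[mentor]:
            unmatched.append(mentee)  # mentor did not list this mentee
            continue

        # Tentatively accept, then release the least preferred if over capacity.
        teams[mentor].append(mentee)
        if len(teams[mentor]) > capacity[mentor]:
            worst = max(teams[mentor], key=lambda m: rank[mentor][m])
            teams[mentor].remove(worst)
            unmatched.append(worst)

    return teams


# Made-up example: four mentees, two mentors with two slots each.
example_teams = match_mentees(
    mentee_prefs={
        "Ana": ["M1", "M2"],
        "Ben": ["M1", "M2"],
        "Cal": ["M1", "M2"],
        "Dee": ["M2", "M1"],
    },
    mentor_prefs={
        "M1": ["Cal", "Ana", "Ben", "Dee"],
        "M2": ["Ana", "Ben", "Dee", "Cal"],
    },
    capacity={"M1": 2, "M2": 2},
)
print(example_teams)
```

In this example three mentees want M1, who only has two slots, so M1 keeps the two it ranks highest and the third ends up with M2. Acceptances stay tentative until nobody is left proposing, which is what keeps any mentor and mentee from both preferring each other over their final match.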

I’d love feedback on some of these topics!

I experienced my first official days as a programmer (something I never strived to be). My current project for the INN Nerds team is a Django application to display a database table with a nice design and include search/filter functionality. Sounds easy, right? Well, as a Django noob, this project proved challenging. I encountered so many new issues and errors that I was close to breaking down in tears many times.

On my own, I was able to tackle most of the simple errors (missing commas, missing modules) but got stuck on the more advanced errors, for which I relied on Google and the better programmers around me.

Here are some things I learned:

  1. Everything comes together: Django puts in practice working with the database as well as designing. You’re a front-end and a back-end developer all in one day.
  2. It’s more important to be able to read, interpret, and reuse another programmer’s code than to write your own. Most functionality you’ll ever need has already been written. If you can find it, use it!
  3. Documentation is key. Good documentation is to die for, and the Django documentation is the best I have seen. At first I found it a bit boring to write docs, but once you see how someone else benefits from good documentation, and once you yourself benefit from someone’s good deed, then you value it.
  4. Learn to ask for help. Google is great for many things, but nothing beats having someone to explain things to you. I admit, with most people I feel ashamed to ask the dumb questions, but if I have one or two close individuals with more experience, I rely on them for the silly noob things. I was also able to tweet some of my issues and got great replies!

As far as my experience with Django, it is the first framework I feel comfortable with. Even as a beginner, and even without understanding many of the concepts behind Django, I feel more comfortable with it than with a framework like WordPress.

Just a few days ago I found the Twitter account @remotedatasci and its site. I tweeted the account holder a very important question: is remote work suitable (and available) for junior data scientists and even interns?

I guess remotedatascience thought it was a good enough question to create a blog post about it. I certainly think so, and not simply because that’s the position I think I could have at this point with my beginner’s knowledge of machine learning and more intermediate work in statistics.

Simply put, I think remote work is a lifestyle choice and not something that employers should discriminate against. I’ve read plenty of posts from people worried about what employers would think about remote work - even one (sorry, can’t find the link) asking if he/she should expect a lower salary for working remotely (some people said yes…). I simply cannot agree with this.

Not only is working remotely a work-lifestyle choice (like deciding where to live or what to wear), but - if done correctly - it can also produce better, more efficient work. As remotedatascience mentions in the blog post, a dedicated worker will be just as dedicated remotely. Many people who work remotely even feel that they have to work harder and be more “present” to make up for not being at the office. Many articles and blogs cover this topic, so I won’t get into that specifically.

But I can say that as someone who is currently working as a remote contractor with a tech team, the setup could not be more ideal. Working remotely encourages us to be on top of communication. I even feel less guilty about asking questions than I would if I had to walk to someone’s office and interrupt their work. With chat and email, a person can answer on his/her own time, and I rarely have to wait for someone to help me with something. Moreover, the setup is respectful of my needs. I suffer from frequent and severe back pain, and it requires a special standing desk setup that I can use to sit and stand interchangeably. I also have to lie down frequently to relieve pain. Any understanding employer should be accommodating - but I have encountered some who have not been so. And quite frankly, how I take my breaks - whether I go to a coffee shop or take a nap - should not be an employer’s business. I get my work done, communicate often, and hopefully am a productive, useful, and friendly team member. What else do you need from me?

Finally, to address the idea of remote junior data science positions, I get it when people say that when you are just learning, it’s really good to be there in person. I get it, but most of MY learning has also come from researching solutions to problems online and asking questions about them - all of which I can do from anywhere in the world. If the person I am learning from is willing to be available via any mode of communication, there is no reason why I can’t ask questions virtually. For data science in particular, if I am not familiar with the method we are using, why not send me links to a MOOC about the method? Why not schedule a Skype meeting to teach me how to use it? There is no reason why a data science intern cannot help out with data cleaning and prep (that’s what I would be doing in the office anyway…).

As I prepare to start my MS this fall, I’d be ecstatic to work with a data science team looking for a part-time remote contractor. If your team embraces remote work and loves data, please don’t hesitate to contact me. I’d love to learn from you!

One of the key parts of data science, particularly in social science, is visualization. I think our data and analysis are meaningless if we cannot communicate them and use them for policy change. This last week I began working for INN’s Nerds team, and one of my first tasks has been to work with member organizations looking to visualize their work.

INN offers members technology services, including help transforming their news and investigative websites with Largo, a WordPress framework INN nerds built for news organizations. So, in addition to recommending data viz tools that are easy to use and interactive, our members also need tools that are WordPress compatible.

This post is not about a tool that many of our members use. In fact, I’m going to write a separate post about tools I would recommend for our members, especially those who do not have a developer on hand. This post is about Wp-D3, the WordPress plugin for d3.js. I was so excited to learn that this plugin exists! D3 has become the go-to data viz tool - for those who are able to work with JavaScript, anyway. I am in the process of learning how to work with d3.js, so I followed this tutorial to learn how to use Wp-D3.

Forthcoming: a detailed post on more data viz tools.