Creating Successful Data Science Projects

Successful projects get you ahead, whether you’re building a network or landing a job. Eye-catching output is the ultimate way to cut through the crowd and stand out to decision makers.

Recently, I went to the University of Connecticut Sports Analytics Symposium, a day-long conference that brought together enthusiasts like myself with academics firmly established in the sports research field. Being around these folks was truly inspiring and taught me a lot that I can apply not only to my passions but also to my day job.

During a panel discussion, I heard from professionals and students who won Kaggle’s National Football League Big Data Bowl, a data science competition that has launched careers and gotten people their dream jobs. The panel included Brendan Kumagai (who now interns at sports intelligence platform Zelus Analytics), Asmae Toumi (who works in health care but is an established sports analytics hobbyist in her free time), and Megan Risdal (a product manager at Kaggle who has witnessed many successful data science projects).

Their advice applies to any project, not just those related to sports. Here’s what I learned.

Find your team

Kumagai and Toumi both emphasized that their successful NFL Big Data Bowl submissions would not have been possible without their teams.

• Be active within your community of interest, especially on platforms like Twitter. People are open to chatting and working with like-minded peers.
• Once you find your team, set the bar of excellence you want to achieve. Agreeing on this standard will guide the amount of work your team should put in.
• Check in frequently and aim for small amounts of progress at each checkpoint.
• Just like machine learning models are “ensembled” together, ensemble a team with defined roles such as statistician, programmer, and project manager. The PM ensures that tasks are getting done, and individuals can specialize in areas they are stronger in (coding, math, etc.).

Ask for help

As the saying goes, “If you want to go fast, go alone; if you want to go far, go together.”

• Ask subject matter experts for help. For example, Toumi’s team reached out to football coaches on Twitter to ask them about game tactics, which in turn allowed them to better model the game situations in the data.
• Make your questions specific so experts can provide quick and direct answers. Busy people don’t have time to review everything and provide thorough feedback.
• Use existing platforms like Kaggle discussion sections. These platforms are a great way to get quick feedback from individuals working on the same problem, allowing for immediate iteration.

Steal others’ ideas

As someone once said, “Good artists borrow, great artists steal.” If you’re working on a problem in an established space, feel free to use features or modeling approaches others have already created. Skip reinventing the wheel and spend that time fine-tuning the existing work instead.

As Kaggle’s own Risdal noted, the best approach is often to combine existing ideas. Ensemble models, which combine the output of multiple models (often with one model’s output fed into another), usually do well in Kaggle competitions.
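
If ensembling is new to you, here is a minimal sketch of the simplest version: averaging the predictions of two models. It is not taken from any Big Data Bowl submission. The data is made up, and the helper names (fit_mean_model, fit_linear_model, ensemble) are hypothetical ones I chose for illustration. Real competition entries typically combine much stronger models, but the idea is the same.

```python
from statistics import mean


def fit_mean_model(xs, ys):
    """Baseline model: always predicts the average training target."""
    avg = mean(ys)
    return lambda x: avg


def fit_linear_model(xs, ys):
    """One-feature least-squares line, fitted in closed form."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def ensemble(models, weights=None):
    """Combine models with a weighted average of their predictions."""
    if weights is None:
        # Default to equal weights for every model.
        weights = [1 / len(models)] * len(models)
    return lambda x: sum(w * m(x) for w, m in zip(weights, models))


# Toy, made-up data purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

models = [fit_mean_model(xs, ys), fit_linear_model(xs, ys)]
blended = ensemble(models)

print(blended(5.0))  # average of the two models' predictions at x = 5
```

Equal weights are just a starting point. A common next step is to tune the weights on held-out data, or to feed the individual predictions into another model, which is usually called stacking.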

Excite your reader

Last but certainly not least, put yourself in your reader’s shoes. Most readers don’t have PhD-level mathematics, don’t care about academic literature, and have a limited attention span.

What would excite someone in this field? Think about visuals and animations that draw the eye and ultimately sell the project.

Ultimately, remember that the real goal is learning and personal growth. Now, what are you waiting for? Go out and start doing.
