Neustar at the 2015 Grace Hopper Celebration of Women in Computing

Author’s Note: Hello readers! I’m Julie Hollek, and I am a data scientist at Neustar, where I focus on understanding questions around identity. Before joining Neustar, I received my PhD in astronomy from the University of Texas at Austin, where I studied the chemical composition of the oldest stars in our galaxy.

This past October, I presented a Student Opportunity Lab (SOL) at the Anita Borg Institute’s Grace Hopper Celebration of Women in Computing (GHC) about my transition from graduate student in astronomy to data scientist at Neustar.  My talk, “Old Stars to Neustar: Academia Lessons Translated to Data Science”, explored four foundational skills I learned through my academic training that are most helpful to my current role as a data scientist: networking, problem solving, communication, and grit.

Interestingly, working through the down times in my academic career (difficult collaborations, the research-teaching-study-life balancing act, and more) helped me develop grit. This has transferred directly not only to working in industry, but to data science in particular. Data scientists are often a minority among technologists, and being a woman in this field means being a minority within a minority. Grit is therefore all the more important for taking a complex idea from inception to completion in a field that does not necessarily reflect your experiences.

Nearly half the GHC attendees are still in college, and SOLs like this provide them with invaluable access to industry professionals who discuss different aspects of working in technology. For me, presenting this SOL was inspiring because I was able to connect with students from across the globe. Encouraging young people to pursue STEM careers and engaging with highly enthusiastic students were both conference highlights; that kind of enthusiasm is infectious.

Since this was my first time attending GHC, I was also excited to learn from women technologists showcasing cutting-edge work. Technical talks at GHC serve the dual purpose of allowing these technologists to promote their work and enabling attendees of all levels to learn about new and interesting subjects. Several tracks were especially relevant to the Neustar Research group, including big data, internet of things, and (most pertinent) data science. Data science as a field is exploding, as evidenced by the popular track devoted specifically to the topic at GHC. One especially outstanding talk was Anita Mehrotra’s “Virality at Buzzfeed”.

black? blue? gold? white?

The talk was framed around an article polling readers on which colors they saw in “The Dress”, a meme that originated on Tumblr. The Buzzfeed post quickly went viral and resulted in the largest traffic ever seen on the site. Buzzfeed’s business model relies on views and clicks, but the company is also heavily invested in user shares, where readers use social media platforms to distribute articles to their networks. Sharing generates over 75% of the site’s traffic. By using the Susceptible-Infected-Recovered (SIR) model from biostatistics, data scientists can characterize the “virality” of a post, which is essentially the ratio of shared views to seed views. Those metrics are then used to optimize performance. Furthermore, by evaluating viral posts using basic graph theory principles, they can answer questions such as: Over which platforms did the sharing occur (edge attributes)? Did large numbers of people share the post, or did a few individuals share it with a lot of other people (tree width)?
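To make those quantities concrete, here is a minimal Python sketch of the ratio and the two graph questions described above. It is an illustration only, not Buzzfeed’s method: the record format, the "seed" marker, the sample data, and every function name are assumptions made for this example, and the SIR model fitting itself is not shown.

```python
from collections import Counter, defaultdict

# Hypothetical view records: (source, viewer, platform).
# A source of "seed" means the view came directly from the site,
# not from another reader's share. All values are made up.
views = [
    ("seed", "a", "site"),
    ("seed", "b", "site"),
    ("a", "c", "facebook"),
    ("a", "d", "facebook"),
    ("a", "e", "twitter"),
    ("c", "f", "pinterest"),
]

def virality(records):
    """Ratio of shared views to seed views."""
    seed_views = sum(1 for source, _, _ in records if source == "seed")
    if seed_views == 0:
        raise ValueError("need at least one seed view")
    shared_views = len(records) - seed_views
    return shared_views / seed_views

def shares_by_platform(records):
    """Edge attributes: how many shared views arrived over each platform."""
    return Counter(platform for source, _, platform in records if source != "seed")

def fan_out(records):
    """Number of views each sharer produced directly.

    Many sharers with a small fan-out suggests broad sharing; a few
    sharers with a large fan-out suggests a handful of influential readers.
    """
    children = defaultdict(int)
    for source, _, _ in records:
        if source != "seed":
            children[source] += 1
    return dict(children)

print(virality(views))
print(shares_by_platform(views))
print(fan_out(views))
```

In this toy data, two seed views led to four shared views, most of them through a single reader, which is the “few individuals sharing with a lot of other people” pattern.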

“The Dress” talk highlights the importance of GHC. This conference is unique in that it offers a way for women to lead the discourse about emerging technologies. The talk showcased novel data science techniques and tools, told through the lens of a popular meme. GHC provided an abundance of interesting topics for further research and showed new and insightful analytics that can be applied to a number of business questions here at Neustar. And it did all this while maintaining diversity and inclusion as its core mission, something that is also at the heart of what we’re trying to do at Neustar.

 

Neustar at SIAM’s SODA 2015

Author’s Note: Hello readers! I’m Sonya Berg, and this is my first post on Neustar’s research blog. I am a data scientist at Neustar Research, where my focus is on solving big data problems in advertising. Before joining Neustar, I completed a PhD in mathematics at UC Davis, where I developed algorithms for quantum computing.

Matt Curcio, Rafael Solari and I had the pleasure of starting our new year off by attending SIAM’s Symposium on Discrete Algorithms (SODA) in sunny San Diego. SODA is a computer science theory conference, with most talks given by academics. Below I highlight two examples of high-level topics that we, as practitioners, were excited to see covered by theorists at SODA.

  1. Dataset privacy.  In a variety of industries, publicly releasing datasets, or even just statistics about datasets, has proved to be useful for R&D. However, there is general concern that publishing data about individuals can compromise personal privacy. For example, last summer we highlighted breaches of privacy in a published dataset. At the SODA conference we learned about advancements in privacy research, including how to privately conduct supervised learning, as well as the application of privacy mechanisms to discourage lying.
  2. Distributed computing. As a big data shop, we use various distributed computing systems, such as Hadoop, Hive, and Amazon’s Redshift. We’ve seen the ugly sides of parallelism, such as clusters bottlenecking on intra-cluster bandwidth during repartitioning. We were glad to learn at SODA that academics are responding to this reality by exploring theoretical approaches to minimizing bandwidth complexity.

Learning directly from the top computer science researchers in the world can certainly be a challenge. But we continue to attend conferences such as SODA because we return home inspired to apply new theoretical concepts to our most interesting R&D problems.

San Diego Sunset

Beautiful San Diego