16 Feb 2023

Achieving scalability & showing value of community

Regis James

Senior Manager (Biopharmaceutical Data Science) at Regeneron Pharmaceuticals
Join us with Regis James as we discuss writing code to develop, optimize, and integrate bioinformatic pipeline efforts that result in the generation of impactful and user-friendly scientific business decision support tools.

Episode notes

We were joined by Regis James, Senior Manager (Biopharmaceutical Data Science) at Regeneron Pharmaceuticals.


Achieving scalability through standardization (19:04)


In order to achieve scalability I do have to standardize things a bit. We use Confluence, so on the wiki that I’ve made, I have a lockdown page where I have some standardized messages that I send out to people at different phases of things.


I do customize and read what people say about themselves and I’m thinking, “this could be useful here, that could be useful there.”


One of the things that I do is maintain an ingestion pipeline to identify new members of the community and people who can help the community learn more about the blockers to their current job responsibilities.


All companies have a directory where you can learn about who exists at the company and see people’s job titles.


I can search the directory for “data” or “data science” or whatever and see whether the people who come up in the results are already in my Microsoft Teams group. If they aren’t, they could be potential new members who would benefit from being in the community.


They might be able to do their job better and also help other people do their job better, if they can connect to others who have similar needs or can offer similar things.


I do this on a semi-regular basis, so I can really help onboard people who have just joined the company or maybe moved into a data science type role.
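As an illustration only, the check described above (search the directory by job title, then keep the people who are not yet in the group) can be sketched in a few lines of Python. This is not Regis's actual pipeline: the `Employee` record, the `find_candidates` function, and the sample data are all hypothetical, and a real version would read from the company directory and the Teams group rather than from hard-coded lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    """Hypothetical directory record; real directories will have other fields."""
    email: str
    name: str
    title: str


def find_candidates(directory, group_emails, keywords):
    """Return directory entries whose job title contains any keyword
    and whose email is not already in the community group."""
    members = {email.lower() for email in group_emails}
    wanted = [keyword.lower() for keyword in keywords]
    return [
        person
        for person in directory
        if any(keyword in person.title.lower() for keyword in wanted)
        and person.email.lower() not in members
    ]


# Made-up sample data standing in for a directory export and a group roster.
directory = [
    Employee("ana@example.com", "Ana", "Data Scientist"),
    Employee("ben@example.com", "Ben", "Accountant"),
    Employee("cy@example.com", "Cy", "Senior Data Engineer"),
]
group_emails = ["Ana@example.com"]

candidates = find_candidates(directory, group_emails, ["data"])
print([person.name for person in candidates])
```

Comparing lowercased emails avoids treating an existing member as new just because the two systems capitalize addresses differently.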


Then I reach out to these people, and use my standardized message like, “Hey, I just want you to know this group exists. This community was built just for people like you. This is what we do and here’s a link to the wiki that has the links of all the recordings” – because we have a growing library of all the conversations just like the Hangouts.


I only send one email and ask people to let me know if they’re interested in it. I don’t spam people.


Of the subset who answer (sometimes it’s around 20% who respond), they say, “Yes, this is amazing. Thank you for letting me know. I just came to the company. I didn’t even know if there was a community.” I’m also doing it from my own perspective of when I joined the company.


When they respond and say they’re interested, I add them to the Teams group and to the Active Directory group, and then forward them the invitation to the ongoing gatherings every other Friday.


In the invitation text I paste in, “Welcome to the group. Please introduce yourself in the introduction section” and include a link that takes them directly into not just the Team for the group, but the channel of introductions.


Since doing this, I’ve gotten a much higher percentage of people who have been added to the group to start introducing themselves and say what they work on.


I see what people are working on, and integrate that with my understanding of what people need help with from the Shiny app (the poll for what talks people are interested in).


Then I reach out to them and say, “Hey, would you be interested in being one of the guests for an upcoming event?” I send them the link and they can see there’s no pressure because I’m a year out. I’ve got someone every other week. So it’s like, “Hey, would this be something you’re interested in?” and then I gently have a conversation over time.


Sometimes people drop out and I just reach out to those people and ask if they would be able to switch.


How do you show the value of community? (40:14)


To prove the contribution, the benefit of a community, I use the words stuck and unstuck because that’s really the only thing the business, or any organization, cares about.


There’s a bunch of stuff that the organization has to do. There are obstacles in the way, they are stuck. How do they get unstuck? 


That’s what data science is for, really. You can argue that it’s also for going from stuck to unstuck.


The non-computational people who have the power to bestow upon you promotions, more money, or more influence over the company to make other decisions, really care about the value of the institution.


If you can show how the existence of that community propagates through the network that exists in the institution so that something goes from stuck to unstuck and you can mark that, you can measure it, then you can say this only happened because of me.


If you’re a community that’s gathered around pulling thorns out of your colleagues’ arms, you can say, we pulled 15 thorns out or we taught another group how to pull 100 thorns out because we were force multipliers.
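One possible way to make that kind of count concrete is to keep a simple log of "unstuck" events and tally them by whether the community solved the problem directly or taught another group to solve it. The sketch below is an assumption for illustration, not something described in the hangout: the `Unblock` record, the `summarize` function, the two category names, and the log entries are all invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Unblock:
    """Hypothetical record of one stuck-to-unstuck event."""
    description: str
    kind: str  # "direct": the community solved it; "enabled": a group it taught solved it


def summarize(events):
    """Count logged unblock events by kind."""
    return Counter(event.kind for event in events)


# Invented entries; a real log would be collected from community members.
log = [
    Unblock("Fixed a colleague's broken package install", "direct"),
    Unblock("Shared a reusable report template", "direct"),
    Unblock("Team trained at a session automated its own QC step", "enabled"),
]

summary = summarize(log)
print(f"direct: {summary['direct']}, enabled: {summary['enabled']}")
```

Keeping the two kinds separate lets the "force multiplier" effect be reported alongside the problems the community solved itself.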


Ask the person with power, the one you need to prove the community’s value to, “What do you hate about your job?” Then think about how you’re reducing those pain points for those people and enabling them to go from stuck to unstuck.


Collect those things and showcase them. From the get-go, you should be thinking about how to indicate success to leadership.


Full Speaker Bio:


At Regeneron Pharmaceuticals, a world-class developer of therapeutic biologics, Regis works as a full-stack data scientist to make medicine and optimize clinical trial logistics by collaborating with colleagues to bring structure to and extract meaning from biological data, helping illuminate non-obvious underlying relationships.


He initiates novel and facilitates existing projects towards the accelerated extraction of actionable biological and therapeutic insights. He personally, and collaboratively with fellow scientists, writes code to develop, optimize, and integrate bioinformatic pipeline efforts that result in the generation of impactful and user-friendly scientific business decision support software tools. Additionally, he coordinates the advising of other data scientists in the development of their own novel bioinformatic webtools and the support of the computational platforms on which this work is accomplished.


Regis achieves these ends by working as both a data scientist and web developer, including leveraging the statistical and graphical power of the R programming language, its interactive Shiny web framework, multi-omic ontologies, and both relational and graph databases to build tools that help him and collaborators to demand more from their data, in an effort to develop and realize therapeutic rationales toward drug development. This requires familiarity with complex data sets and ever-evolving data science workflows (conceptualizing relationships, constructing analytical pipelines, querying, addressing discrepancies and heterogeneity in format, transforming, integrating, programming and implementing analysis, graphically representing, communicating results to collaborators, etc.).


In addition, during his tenure as a PhD student, he was the primary developer of an algorithm and companion web application (OMIM Explorer) for rapid, cost-effective, and highly accurate genome-wide genetic diagnostics and disease gene discovery. OMIM Explorer is, in effect, a software prototype demonstrating possibilities for computational approaches to personalized/precision medicine.

