Data Science
Seeking Advice on Transitioning to FAANG Roles with Diverse Data Experience
To give a bit of background, I have nearly five years of experience in the data field, distributed as follows: 40% in analytics, 40% in data science, and 20% in machine learning engineering. Throughout this time, I've worked for large non-FAANG companies, gaining experience in building pipelines, conducting time series analyses, classification, regression, and deploying models via APIs. While some of my projects haven't succeeded due to external factors, others have delivered substantial value.
Educationally, I hold a master's degree in data science from a reputable institution. Given the breadth of my experiences, I'd describe myself as a data generalist.
Currently, I'm exploring the job market for intermediate to senior roles in data science and MLE. With the influx of FAANG professionals in the market, standing out has become a challenge. While I'm not in a rush to exit my present position, I'm growing concerned about my diverse experience potentially limiting me. I worry that without focusing on a specific niche, like classification or regression, I might face challenges transitioning to a FAANG company, especially as I aspire to join at a non-entry level.
Has anyone with a similar background navigated this? I'd appreciate insights on the best strategies to position myself optimally for opportunities in big tech.
submitted by /u/DSby2021
Data Science
What's your educational background
Hi r/datascience. I am interested to know the educational qualifications/background of the members of the group. Personally I have a Bachelor's degree in Maths + an MBA. Have been working in Banking + Analytics for the last 12 years. I know we have CS graduates in this group and those who have done MS in data science and Analytics. Would be good to know the diverse educational background of others as well.
submitted by /u/LowLab2791
Data Science
What type of model(s) would you recommend using for time series predictions including volatility validations?
In my experience, ARIMA has always done a decent job. I never saw any need to resort to LSTM. Anyone care to share their experience?
submitted by /u/Difficult-Big-3890
Data Science
Identifying time series patterns advice
Hey you guys, I have something I am stuck at and need your advice.
Long story short, an example:
* Customer A likes to buy at the beginning of the month only.
* Customer B likes to buy at the end of each week, when visited by an agent, because he stocks up.
* Customer C likes to buy at the beginning, middle, and end of the month.
And so on, you kinda get the problem.
I want to be able to identify these patterns. I was thinking of a possible solution, but I lack the experience to judge it: decompose the seasonal component of each retailer's time series, then use k-means to cluster retailers whose seasonal purchasing components are similar?
If you think this approach is invalid, please feel free to suggest something I could read.
Thanks.
submitted by /u/Careful_Engineer_700
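One way to try the idea in the post is sketched below. It is a minimal illustration, not a recommended pipeline: instead of a full seasonal decomposition it substitutes a simpler stand-in, the share of each retailer's purchases that fall in the beginning, middle, and end of the month, and clusters those profiles with a small hand-written k-means. The retailer names, the purchase dates, the three-bin split, and the helper names (`month_part_profile`, `kmeans`) are all assumptions made for the example. A weekly pattern like Customer B's would need a day-of-week profile appended in the same way.

```python
from datetime import date


def month_part_profile(purchase_dates, n_bins=3):
    """Share of purchases falling in each part of the month."""
    counts = [0] * n_bins
    for d in purchase_dates:
        # With n_bins=3: days 1-10 -> bin 0, 11-20 -> bin 1, 21-31 -> bin 2.
        idx = min((d.day - 1) * n_bins // 30, n_bins - 1)
        counts[idx] += 1
    total = sum(counts) or 1  # avoid dividing by zero for empty histories
    return [c / total for c in counts]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    for _ in range(n_iter):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centroids)


# Hypothetical purchase histories for four retailers.
histories = {
    "A": [date(2023, m, d) for m in (1, 2, 3) for d in (1, 2)],
    "B": [date(2023, m, 3) for m in (1, 2, 3)],
    "C": [date(2023, m, 28) for m in (1, 2, 3)],
    "D": [date(2023, m, d) for m in (1, 3) for d in (27, 30)],
}
names = list(histories)
profiles = [month_part_profile(histories[name]) for name in names]
labels = kmeans(profiles, k=2)
clusters = dict(zip(names, labels))
print(clusters)
```

Normalising each profile to shares rather than raw counts keeps large and small retailers comparable, so the clusters reflect timing rather than volume.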
Data Science
Python library to interactively filter a dataframe?
For all intents and purposes it's basically a Power BI table with slicers/filters, or a GUI version of df[(mask1) & (mask2) & (mask3)].sort_values(by='col1') where you can choose interactively which columns to mask, how to mask them, and how to sort, resulting in a perfectly tailored table.
I have scraped a list of every game on Steam, so I have a dataframe of about 180k games and 470+ columns, and I was thinking how cool it would be if I could make a table as granular as I want it. For example: find me games from 2008 that have 1000 total ratings and more than 95% positive Steam reviews, with the tag "FPS", sorted by release date, and hide the majority of columns.
If something like this doesn't exist but could be built in something like Flask (which I have NO knowledge of), let me know. I just wanted to check if the wheel exists before rebuilding it. If what I want really is difficult to do, let me know and I can just make the same thing in Power BI. This will also make me appreciate Power BI as a tool.
submitted by /u/lowkeyripper
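The non-interactive core of this request can be sketched in plain pandas. This is only an illustration under stated assumptions: the column names (`release_date`, `total_ratings`, `positive_pct`, `tags`), the sample rows, and the helper `filter_games` are hypothetical, "1000 total ratings" is read as "at least 1000", and no GUI layer is included. Each optional argument adds one boolean mask, which mirrors the `df[(mask1) & (mask2) & (mask3)]` pattern from the post.

```python
import pandas as pd


def filter_games(df, year=None, min_ratings=None, min_positive_pct=None,
                 tag=None, sort_by=None, columns=None):
    """Combine optional masks, then sort and select columns."""
    mask = pd.Series(True, index=df.index)  # start by keeping every row
    if year is not None:
        mask &= df["release_date"].dt.year == year
    if min_ratings is not None:
        mask &= df["total_ratings"] >= min_ratings
    if min_positive_pct is not None:
        mask &= df["positive_pct"] > min_positive_pct
    if tag is not None:
        # Assumes each cell of "tags" holds a list of tag strings.
        mask &= df["tags"].apply(lambda tags: tag in tags)
    out = df[mask]
    if sort_by is not None:
        out = out.sort_values(by=sort_by)
    if columns is not None:
        out = out[columns]  # hide everything except the requested columns
    return out


# Hypothetical sample standing in for the scraped Steam table.
games = pd.DataFrame({
    "name": ["Game A", "Game B", "Game C", "Game D"],
    "release_date": pd.to_datetime(
        ["2008-11-15", "2008-03-01", "2009-01-01", "2008-06-10"]),
    "total_ratings": [1500, 2000, 5000, 300],
    "positive_pct": [96.0, 97.5, 99.0, 98.0],
    "tags": [["FPS", "Action"], ["FPS"], ["FPS"], ["FPS", "Indie"]],
})

result = filter_games(
    games, year=2008, min_ratings=1000, min_positive_pct=95, tag="FPS",
    sort_by="release_date", columns=["name", "release_date"],
)
print(result)
```

Keeping the filter as one function with keyword arguments means a front end, whatever it ends up being, only has to map widget values onto those arguments.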
Data Science
How have you approached training yourself to become better at business acumen/context for your DS work?
This is the thing I'm struggling with most. Coming from an academic background, the concerns seem to be different, but I'm still having trouble articulating exactly how, or what to do to train myself to be more business-y, if that makes sense.
submitted by /u/AnxiousEgg6284
Data Science
Weekly Entering & Transitioning - Thread 30 Oct, 2023 - 06 Nov, 2023
Welcome to this week's entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include:
* Learning resources (e.g. books, tutorials, videos)
* Traditional education (e.g. schools, degrees, electives)
* Alternative education (e.g. online courses, bootcamps)
* Job search questions (e.g. resumes, applying, career prospects)
* Elementary questions (e.g. where to start, what next)
While you wait for answers from the community, check out the FAQ and Resources pages on our wiki. You can also search for answers in past weekly threads.
submitted by /u/AutoModerator
Data Science
Maintaining a work life balance
How is everyone keeping a good work life balance in this industry? Or work in general.
I am currently doing my master's while working full time as a DS, and also doing certifications as requested by my managers.
I am forcing myself to sleep earlier, but the daily screen time is just too draining for my eyes to keep up.
submitted by /u/EstablishmentHead569
Data Science
Recommendation for measuring similarity of paragraphs
I’m doing some analysis and part of my data, possibly a very important part, is a text description of a product. I want to determine if there’s a correlation between the product description and performance, but to do this I need to cluster the descriptions into similar groups. I’m thinking text embeddings could be useful, but I’m unsure of which ones to use. Can anyone provide some advice?
Possibly more important, if I’m completely barking up the wrong tree, please let me know.
submitted by /u/Hot-Profession4091
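Before committing to a particular embedding model, a cheap baseline can show whether the descriptions separate at all. The sketch below deliberately swaps in TF-IDF vectors for learned text embeddings and compares them with cosine similarity, using only the standard library. The product descriptions and the helper names (`tfidf_vectors`, `cosine`) are made up for the example, and the IDF smoothing is one common choice rather than the only one.

```python
import math
import re
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in docs]
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        length = len(tokens) or 1  # guard against empty descriptions
        vectors.append({
            term: (count / length)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical product descriptions.
descriptions = [
    "Waterproof leather hiking boots with ankle support",
    "Durable hiking boots, waterproof, made of leather",
    "Stainless steel kitchen knife set",
]
vectors = tfidf_vectors(descriptions)
similar = cosine(vectors[0], vectors[1])
different = cosine(vectors[0], vectors[2])
print(round(similar, 3), round(different, 3))
```

TF-IDF only matches shared words, so paraphrases with no vocabulary in common score zero; that limitation is the main reason to move on to embeddings if the baseline looks promising.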
KDnuggets
Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models
Thought Propagation is a prompt engineering technique that instructs LLMs to identify and tackle a series of problems that are similar to the original query, and then use the solutions to these similar problems to either directly generate a new answer or formulate a detailed action plan that refines the original solution.
insideBIGDATA
Kickstart Your Business to the Next Level with AI Inferencing
The need to accelerate AI initiatives is real and widespread across all industries. The ability to integrate and deploy AI inferencing with pre-trained models can reduce development time with scalable secure solutions that would revolutionize how easily you can capture, store, analyze, and use data to be more competitive.
➖ Sent by @TheFeedReaderBot ➖
Data Science
Favorite ML Example?
I feel like a lot of Kaggle examples use really simple data sets that you never find in real-world scenarios (the Titanic data set, for instance).
Does anyone know any notebooks/examples that start with really messy data? I really want to see someone go through the process of EDA/Feature engineering with data sets that have more than 20 variables.
submitted by /u/Throwawayforgainz99
Data Science
Are all higher level data science jobs like this?
I'm really not sure how to summarize this concisely in a neat title, so just let me explain.
At previous lower level jobs, we were organized. We had ticket tracking systems, step-by-step procedures for all of the commonly done work, and checklists that people could sign off on as they completed work. And most importantly, even for one-off requests, the primary mode of communication was email. That way, I had the project specifications and/or updates spelled out in front of me that I could refer back to whenever needed.
As I get higher up in the field at different companies, I'm finding the primary mode of communication is virtual meetings. All of the background, specifications, and next steps are given verbally, and I'm sitting here in these meetings furiously trying to write down everything that is being said. What's worse is that the ideas for the projects often aren't fully developed and we have to figure them out, so I get a lot of "do this, actually no, let's do it this way, but I'm actually thinking it would be better to approach it this way...". As you can imagine, this makes it difficult to fully understand the next steps of a given project. If I use my judgement and approach it the way I feel is best, half the time it ends up not being what management wants and I have to waste their time and mine on rework.
One of the ways I tried to work around management's brain dumps was to recap back to them the next steps they wanted from me, but they're super busy so they always join the meetings late, and as a result we frequently run out of time. 75% of the time I try to message or email them with questions, they just don't respond, so the only way I can get any info out of them is via virtual meetings. This creates an environment where mistakes happen more easily, and it's turning into a situation where I can do 9 things right, but if I missed or misunderstood the 10th thing, I'm getting crucified for it (meanwhile this is a common occurrence for management, but that's a different rant...). I'm being made to feel like it's a shortcoming of mine for not being able to take down everything accurately.
I know some people can thrive in these conditions. For me, it's tough. I'm definitely a scatterbrain and I try to compensate for this by being as organized as humanly possible, but it's just easier said than done when most everything is being given ONLY verbally. I understand that the higher you go in data science, the less routine and the more exploratory and R&D your work becomes, so having clearly documented procedures becomes less realistic. But if this is the way most of these positions are going to be, I really don't feel like this field is for me.
submitted by /u/son_of_tv_c
Data Science
What is the best way to access computation power for a pet project on small LLMs and BERT fine-tuning, without spending a fortune?
I'm a data scientist with a pet project that could turn into something more, but I need more computation power. I have a PC with an RTX 2060 SUPER, but it's getting old. I'm considering Colab Pro+, but I prefer to work with VS Code and build my projects as folders rather than notebooks. I've also explored cloud options, but they seem expensive. My last resort is to buy a refurbished 16GB V100, but I'm hoping to find a more affordable solution.
submitted by /u/David202023
KDnuggets
Data Warehouses vs. Data Lakes vs. Data Marts: Need Help Deciding?
A comparative overview of data warehouses, data lakes, and data marts to help you make informed decisions on data storage solutions for your data architecture.
KDnuggets
Top 10 AI Startups to Work for in India
Want to jumpstart your career in AI? Discover the top 10 trailblazing Indian startups that are reshaping industries with cutting-edge innovation and providing unparalleled opportunities to work on impactful projects.
Data Science
How does one find freelance or contract work? Short or long term would be fine.
I work full time as a data scientist and now have 3 years of experience. I've become significantly more efficient and experienced, and I feel that I could take on more work than my company gives me. My boss is very flexible and wouldn't mind if I took some extra work on the side. How do people find contracts for short-term gigs? Are there any sites in particular people have had success with? What do you typically bill at?
submitted by /u/Unhappy_Technician68
Data Science
Where to Draw the Line between Proof of Concept and Deployment?
Are you currently involved in a project that revolves around fulfilling customer requirements? As part of your responsibilities, are you tasked with deploying a functional data science project?
I'm referring to the point at which you determine that the project is prepared for delivery. Is it sufficient to provide a functional model based on a script or notebook, accompanied by a presentation that includes relevant metrics? Or do you also engage in the deployment phase? I'm somewhat perplexed because there is often a request for a "proof of concept," but is functional code alone sufficient to satisfy this requirement?
I am a part of a small team and my team seldom deals with external clients, so I'm unsure about the boundaries between what should be accomplished before transitioning to a production-level stage.
submitted by /u/missing-in-idleness
Data Science
Computer Science Courses + Certs
Hello All, I’m an engineer and I’d like to learn the basics of computer science well, roughly the first semester of an undergrad degree, so that I can continue to self-study AI and hopefully do a master's later.
What’s a good resource to do so? And is there a good certification course? I learn more complex topics like these better when I’ve put money on the line and there’s a little diploma at the end.
Thank you!
submitted by /u/ChelsMe
Data Science
Has anyone tried Cursor.sh AI editor for data science?
I've seen a few people talk about Cursor (https://cursor.sh/) for software development, saying that it was good. Has anyone tried it for data science?
submitted by /u/soggypocket