Mr Rager

Leader of the Delinquents
Supporter
Joined
Feb 9, 2014
Messages
15,572
Reputation
5,650
Daps
69,883
Reppin
Mars
Where do ya'll see the Data Science field in 10+ years? Seems like now, "data science" is a conglomerate term for BI, data analytics, data visualization, and statistics when all of these are separate and distinct roles

Does your DS team actually provide insight and value to the company, or are you just displaying the data to the higher ups? I've seen plenty of "data science" presentations that don't tell us what we're actually supposed to get from the data...it's just a bunch of charts. Just something to keep in mind.
 

Regular Developer

Supporter
Joined
Jun 2, 2012
Messages
8,063
Reputation
1,786
Daps
22,736
Reppin
NJ
Where do ya'll see the Data Science field in 10+ years? Seems like now, "data science" is a conglomerate term for BI, data analytics, data visualization, and statistics when all of these are separate and distinct roles

Does your DS team actually provide insight and value to the company, or are you just displaying the data to the higher ups? I've seen plenty of "data science" presentations that don't tell us what we're actually supposed to get from the data...it's just a bunch of charts. Just something to keep in mind.
When I hear data science, I think it's a subset of Business Intelligence. Whenever I see a job description or an article, though, data science seems to be more related to statistics and machine learning.

I really like this idea of leveraging the cloud for most of this stuff. Snowflake already seems to be making a big impact as a managed data warehouse where there isn't a constant need to coordinate with the server team and DBAs. I'm thinking there'll be more tools like this, where people just package a bunch of cloud services together and then offer them as SaaS. And I guess it's not just limited to the BI space.

As mentioned in the programming thread, I just recently learned about gRPC/HTTP/2 and how it supports bidirectional streaming. I'm actually interested to see what happens to the real-time streaming data tools. Since message brokering systems seem to be a popular way of handling this, I wonder if any of the popular ones are going to adjust.
 

Regular Developer

Supporter
Joined
Jun 2, 2012
Messages
8,063
Reputation
1,786
Daps
22,736
Reppin
NJ
Here are just a few tools I've worked with or heard of in the BI space:

Reporting/Data Viz:
QlikView/QlikSense (what I like about Qlik is that you can build the data model in Qlik, rather than having to build it in the DB), Tableau, PowerBI, ThoughtSpot, Looker, Business Objects (an SAP product, but from what I've heard from an SAP BI user, it's pure trash, lol)

Data Pipeline - ETL/ELT:
Informatica, Attunity (partnered with Qlik recently), Talend, Alteryx, Wherescape

Data Pipeline - Streaming: (still getting ramped up on this space)
StreamSets, Kafka Connect (interacts directly with the Kafka queuing system)

DataWarehousing:
TimeXtender (this is kind of a combined ETL/data warehouse tool), Snowflake, AWS Redshift, Azure SQL Data Warehouse (or just any relational DB, but structured in a denormalized fashion)

Task Scheduler:
Airflow, Luigi

Languages for Data Scientists:
Python and R

And if you want to mess around with these things without the overhead of always doing admin/devops/dataops work, you can either get familiar with Docker, which lets you spin up services in containers locally (on Windows, Docker can run them on top of Hyper-V), or you can spin up server instances in AWS or Azure.

Edit: And here's a couple of books
The data warehouse developer toolkit - Pretty much just the methodology of building out a data warehouse. I think this one is star-schema-centric.

Agile Data Warehouse Design - It takes the agile SDLC methods and applies them to building out a data warehouse. I like it because instead of spec'ing out a warehouse completely based on the data, it's more iterative and collaborative across all levels, from the business to the operational DBAs.
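
For anyone who hasn't seen a star schema before, here's a minimal sketch in Python using the built-in sqlite3 module. The table names (fact_sales, dim_date, dim_product) and the sample rows are made up purely for illustration, not taken from either book. The idea is one central fact table of measurements, surrounded by descriptive dimension tables that you join to when reporting.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny star schema: one fact table plus two dimension tables.

    All table names and rows here are hypothetical examples.
    """
    conn.executescript(
        """
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year     INTEGER,
            month    INTEGER
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            name        TEXT,
            category    TEXT
        );
        CREATE TABLE fact_sales (
            date_key    INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity    INTEGER,
            amount      REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20190101, 2019, 1), (20190201, 2019, 2)],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [
            (20190101, 1, 2, 20.0),
            (20190101, 3, 1, 15.0),
            (20190201, 2, 1, 30.0),
            (20190201, 1, 1, 10.0),
        ],
    )


def sales_by_category(conn):
    """Total sales amount per product category and month."""
    return conn.execute(
        """
        SELECT p.category, d.month, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        JOIN dim_date    AS d ON d.date_key    = f.date_key
        GROUP BY p.category, d.month
        ORDER BY p.category, d.month
        """
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_star_schema(conn)
    for row in sales_by_category(conn):
        print(row)
```

A reporting tool would typically sit on top of queries shaped like sales_by_category: the fact table holds the numbers, and the dimensions give you the labels to slice by.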
 

Rawtid

Veteran
Supporter
Joined
Jun 23, 2012
Messages
43,323
Reputation
14,608
Daps
119,417
Great thread! I’m still trying to narrow down my overall direction and which position I want. The skills are so varied, so job title is hard to go by and I don’t want to limit myself.

End of the month, I’ll have a cert in applied analytics from SAS. May 2020 I’ll have an MS in data analytics w/ a specialization in project management, going to sit for the exam shortly after and then get my Developer Cert for QlikSense. I think that’s a good combo of credentials that I can create opportunities with.

I thought about a Data Scientist role, but I’m not sure I want to get advanced in Python or R. I’ve been exposed to them and I don’t prefer them. However, if I found a role using SAS, I would hop on it.
 

Warren Moon

Superstar
Supporter
Joined
Jun 1, 2014
Messages
8,656
Reputation
760
Daps
25,588
Where do ya'll see the Data Science field in 10+ years? Seems like now, "data science" is a conglomerate term for BI, data analytics, data visualization, and statistics when all of these are separate and distinct roles

Does your DS team actually provide insight and value to the company, or are you just displaying the data to the higher ups? I've seen plenty of "data science" presentations that don't tell us what we're actually supposed to get from the data...it's just a bunch of charts. Just something to keep in mind.

For me personally, they do. But I would say I tell them distinctly what I need, and then I keep digging further and further. A lot of business users don’t ask for deeper insights and therefore don’t get any real value.
 

Secure Da Bag

Veteran
Joined
Dec 20, 2017
Messages
39,840
Reputation
20,309
Daps
125,738
I've been sleeping on this Data Warehousing thing. It's interesting but not as difficult as I thought it would be. Don't know a damn thing about filegroups though.
 