This page is an archived excerpt of the corresponding thread on MITBBS (未名空间).
Programming board - What are the prospects for working in big data?
G***n
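Posts: 877
1
I've recently become interested in big data. By my own analysis, the field splits mainly into a programming-oriented track, cloud computing (including parallel computation and Hadoop), and a research-oriented track, data mining and machine learning (including pattern recognition and prediction). There are also data-processing steps such as data cleaning and data visualization.
It seems that many companies now need data scientists who work with Hadoop and .NET to build web services. I'd like to move in this direction. My feeling is that the research-oriented track should become more valuable over time and is less likely to be made obsolete by new technology. But I don't know how long this direction will last or what the pay will be like. I'm a complete beginner and know only a little about big data. What do the experts here think?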
c****e
Posts: 1453
2
The future is bright, BUT you have to get into the right position. Data
scientist is a hot title, although the actual responsibilities vary a lot. You
may end up doing data cleanup and playing around with the data to extract
useful features.
There are several pillars of the big data paradigm: cloud (that is, large
distributed processing infrastructure), machine learning, visualization, etc.
Infrastructure-wise, Hadoop (MapReduce) is a well-known one. There is also a
lot of work on real-time event processing, in-memory DBs, and column-based DBs.
Lots of innovation is happening in parallel RDBMSs as well.
ML: Get a book and understand classifiers, decision trees, and SVMs. If you
know how to use them, that's good enough. Deep learning is super hot, so you
can take a look at it. NLP is a plus in many JDs.
Visualization: That's a big part. Many startups are working on this.
Building a big data pipeline is a complicated task, and it differs from company
to company. The demand is very high and keeps growing. If you have a strong
combination of the areas above, recruiters will chase you. The pay should be
decent, and you have a better chance of getting into star startups.
Without a strong background, it's very hard to get a data scientist position.
n******t
Posts: 4406
3
People who are afraid of new technology won't do well at anything.
[In reply to G***n]
G***n
Posts: 877
4
Thanks for the comments. I am more interested in the application side of
machine learning. However, I am not good at memory-related topics or parallel
computation. I am thinking about the next 20 to 30 years, not just the next 5.
I guess computational speed and memory will no longer be a problem within 10
years, so Hadoop may not be that hot 5 years from now. What do you think?
Also, what does visualization work involve at a startup?
[In reply to c****e]
G***n
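Posts: 877
5
By "new technology" I meant new languages. My point is that if you master a fundamental area rather than a particular language, you won't be left behind in CS.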

[In reply to n******t]
S*******s
Posts: 13043
6
I've been hearing this term a lot lately. Is it a technology or a new industry? Is it related to data mining?
e*****t
Posts: 1005
7
It's a marketing term, just like "cloud".

[In reply to S*******s]
T*****u
Posts: 7103
8
Does ML work require programming?
d********u
Posts: 5383
9
Yes, and it also requires domain knowledge.
That is why some eminent theorists can't build anything practical.

[In reply to T*****u]
T*****u
Posts: 7103
10
What exactly is domain knowledge? Does it come in at the application stage, or all the way down at the numerical optimization stage?

[In reply to d********u]
p**o
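Posts: 3409
11
For example, if you do big data analysis in finance, you need a finance background;
if you do big data analysis of IP networks, you need to know the basics of TCP/IP.
"Numerical optimization" is only a means. Unless you do research in that area,
it is orthogonal to domain knowledge.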
[In reply to T*****u]