s*********e posts: 1051 | 1 Which is better for data scientists / analysts?
Thanks |
p*****2 posts: 21240 | 2 clojure |
s*********e posts: 1051 | 3 Could you elaborate, guru?
Also, I read your earlier posts and found them quite helpful. Thanks a lot.
[p*****2 wrote] : clojure
|
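p*****2 posts: 21240 | 4
Scala is too heavy for a data scientist. And Lisp is surprisingly good at data processing.
[s*********e wrote] : Could you elaborate, guru? : Also, I read your earlier posts and found them quite helpful. Thanks a lot.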
|
c****e posts: 1453 | 5 For data lifting, Clojure is better. For backend infrastructure development,
Scala is a better pick for most teams. |
s*********e posts: 1051 | 6 Could you kindly explain why Clojure is better for data lifting in detail?
Appreciate it!
[c****e wrote] : For data lifting, Clojure is better. For backend infrastructure development, : Scala is a better pick for most teams.
|
n****1 posts: 1136 | 7 R/MATLAB have a much better community for scientific computing and number crunching.
Even Python has a lot of number crunching libraries. But I have never heard of any
friend doing scientific computation on the JVM.
I'm not sure what a "data scientist" is, but clearly coding is just a tool for
you, rather than a job as it is for others here. In this case you'd better stick to
the common practice in your field. Talk to your colleagues, because no one here
knows your situation.
Just one question: is your computation mainly integer or floating point? The JVM's
floating point is inaccurate, which is why no one in academia uses it.
[s*********e wrote] : Could you kindly explain why Clojure is better for data lifting in detail? : Appreciate it!
|
c******o posts: 1277 | 8 There is no problem with using either.
The key is what kind of project you are going to do.
Use Clojure for a prototype, a fast project, or a project that is not yet well understood.
For a long-term, big-team, well-understood project, I would definitely say Scala.
Clojure has one way of doing things, and it is very clean, but that way is not the
best for everything.
Scala is very complex if you want to use all of its power, but it is really the only
thing you can use like a Haskell and at the same time like a clean, powerful
Java. With a good plan, IMO, Scala code can be much more efficient, reusable, and
safer than Clojure code.
For a data science stack, look at Spark from Berkeley AMPLab, Hadoop,
Yahoo etc. I believe it will win over Storm in the long run. |
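e*******o posts: 4654 | 9 http://blog.fogus.me/2013/07/22/fp-vs-oo-from-the-trenches/
If the data has no particularly complex relationships, Clojure is simpler and quicker to pick up.
But neither of them beats Python, as long as you are not writing new algorithms from scratch. |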
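s*********e posts: 1051 | 10 Since you put it that way, Clojure it is.
Could you recommend an introductory book? Thanks a lot!
[p*****2 wrote] : Scala is too heavy for a data scientist. And Lisp is surprisingly good at data processing.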
|
p*****2 posts: 21240 | 11
Clojure in Action, though a new edition is coming out soon.
[s*********e wrote] : Since you put it that way, Clojure it is. : Could you recommend an introductory book? Thanks a lot!
|
c****t posts: 19049 | 12 Still tinkering with this?
[s*********e wrote] : Which is better for data scientists / analysts? : Thanks
|
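c****t posts: 19049 | 13 It seems the current generation doesn't know that JVM floating point is inaccurate. Or rumor has it that the problem has been solved.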
[n****1 wrote] : R/MATLAB have a much better community for scientific computing and number crunching. : Even Python has a lot of number crunching libraries. But I have never heard of any : friend doing scientific computation on the JVM. : I'm not sure what a "data scientist" is, but clearly coding is just a tool for : you, rather than a job as it is for others here. In this case you'd better stick to : the common practice in your field. Talk to your colleagues, because no one here : knows your situation. : Just one question: is your computation mainly integer or floating point? The JVM's : floating point is inaccurate, which is why no one in academia uses it.
|
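c****t posts: 19049 | 14 That is all idealism. The basic building blocks of scientific computing have never needed anything beyond even FORTRAN 77. For data analysis, either
use SAS (at least SAS has IML now, so it is less clunky), or, if you really must write your own algorithms, learn C and use Rcpp and
Cython.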
[c******o wrote] : There is no problem with using either. : The key is what kind of project you are going to do. : Use Clojure for a prototype, a fast project, or a project that is not yet well understood. : For a long-term, big-team, well-understood project, I would definitely say Scala. : Clojure has one way of doing things, and it is very clean, but that way is not the : best for everything. : Scala is very complex if you want to use all of its power, but it is really the only : thing you can use like a Haskell and at the same time like a clean, powerful : Java. With a good plan, IMO, Scala code can be much more efficient, reusable, and : safer than Clojure code.
|
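c*******9 posts: 9032 | 15 Overly specialized tools come with many limitations too. For example, if you need parallelism later, can something like SAS do that easily?
[c****t wrote] : That is all idealism. The basic building blocks of scientific computing have never needed anything beyond even FORTRAN 77. For data analysis, either : use SAS (at least SAS has IML now, so it is less clunky), or, if you really must write your own algorithms, learn C and use Rcpp and : Cython.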
|
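s*********e posts: 1051 | 16 SAS supports parallelism.
[c*******9 wrote] : Overly specialized tools come with many limitations too. For example, if you need parallelism later, can something like SAS do that easily?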
|
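n****1 posts: 1136 | 17 I searched a bit, and there are still plenty of articles complaining about JVM floating point.
This really is a deal breaker: if the numbers come out wrong, it doesn't matter how powerful the language is.
[c****t wrote] : It seems the current generation doesn't know that JVM floating point is inaccurate. Or rumor has it that the problem has been solved.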
|
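d****i posts: 4809 | 18 Doesn't Java's floating point implementation follow the IEEE 754 standard?
[n****1 wrote] : I searched a bit, and there are still plenty of articles complaining about JVM floating point. : This really is a deal breaker: if the numbers come out wrong, it doesn't matter how powerful the language is.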
|
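l*********s posts: 5409 | 19 What is the problem, really? Limited floating point precision is not
unique to Java, I think.
[n****1 wrote] : I searched a bit, and there are still plenty of articles complaining about JVM floating point. : This really is a deal breaker: if the numbers come out wrong, it doesn't matter how powerful the language is.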
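A minimal sketch of the point that limited precision is not specific to Java, assuming CPython, whose `float` is an IEEE 754 binary64 value (the same format as a JVM `double`). The variable names are illustrative only.

```python
import math
import sys

# CPython's float is an IEEE 754 binary64 value with a 53-bit significand,
# the same format the JVM uses for double.
significand_bits = sys.float_info.mant_dig

# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum is rounded.
total = 0.1 + 0.2
is_exact = (total == 0.3)
print(is_exact)     # False
print(repr(total))  # 0.30000000000000004

# Comparing with a tolerance is the usual remedy, whatever the language.
is_close = math.isclose(total, 0.3, rel_tol=1e-9)
print(is_close)     # True
```

The same arithmetic on a JVM `double` should print the same digits, since both sides round the same binary64 values.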
|
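n****1 posts: 1136 | 20 It is only a subset of IEEE 754; some parts are not supported.
Also, Python/MATLAB/R all use the LAPACK stack for their underlying matrix operations. If you use Scala
/Clojure and the underlying layer is not that stack, your results may well differ from everyone else's.
[d****i wrote] : Doesn't Java's floating point implementation follow the IEEE 754 standard?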
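One plausible reason two numerical backends can disagree in the last digits, sketched here under the assumption that they simply order their operations differently: floating point addition is not associative. This is a toy illustration in plain Python, not a model of what LAPACK or any JVM library actually does.

```python
import math

values = [0.1, 0.2, 0.3]

# The same three numbers, summed in two different orders.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right)  # 0.6000000000000001
print(right_to_left)  # 0.6

# Both results are valid IEEE 754 answers; they differ only in the last bit.
order_matters = left_to_right != right_to_left

# math.fsum compensates for the rounding lost in intermediate sums.
accurate = math.fsum(values)
```

A difference of this size is a reproducibility concern rather than a sign that either backend computed the wrong answer.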
|
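c******o posts: 1277 | 21 Real scientific computing of course relies on specialized libraries, which use arbitrary-precision numbers rather than
floating point representation, so it has little to do with being on the JVM. |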
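As a rough sketch of what arbitrary-precision arithmetic looks like in practice, here is an example using Python's standard `fractions` and `decimal` modules; the JVM offers comparable types in `java.math.BigDecimal` and `java.math.BigInteger`. The 50-digit precision is an arbitrary choice for illustration.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Exact rational arithmetic: no rounding takes place at all.
exact_sum = Fraction(1, 10) + Fraction(2, 10)
rational_ok = (exact_sum == Fraction(3, 10))
print(rational_ok)  # True

# Decimal arithmetic with a user-chosen precision (50 significant digits here).
getcontext().prec = 50
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_ok = (decimal_sum == Decimal("0.3"))
print(decimal_ok)   # True

# Division is rounded to the configured precision instead of to 53 bits.
third = Decimal(1) / Decimal(3)
```

The trade-off is speed: these types are implemented in software and are much slower than hardware floating point.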