L*********y | Posts: 1
https://www.smartrecruiters.com/Ancestry/88345635-senior-data-s
Company Description
Ancestry is the world's largest online resource for family history. We have
helped pioneer the market for online family history research, taking a
pursuit that was expensive and time-consuming and making it easy, affordable
and accessible to anyone with an interest in their family history. The
foundation of our service is an extensive collection of billions of
historical records that we have digitized, indexed and put online over the
past 17 years. These digital records and documents, combined with our
proprietary online search technologies, tools and collaboration features,
have enabled our more than two million subscribers to create over 13 billion
historical records, along with millions of DNA results to make meaningful
discoveries about the lives of their ancestors.
With over 1,400 employees around the world, we are known for our cutting-edge
technology and phenomenal innovation, and we offer a compelling and rewarding
workplace where you will thrive. We seek out passionate people to join our
mission of helping people discover, preserve and share their family history.
We invite you to explore and discover the many opportunities that await you
at Ancestry.
Job Description
The Data Mining Product team is looking for an experienced Data Scientist who
has a passion for building data products and data systems.
Key Responsibilities / Performance Requirements:
Understand existing business flows and website features, dive into the
underlying data, apply relevant Data Mining techniques and/or Machine
Learning algorithms, and propose data analytics products to improve the
website's intelligence.
Implement applicable Machine Learning or statistics-based algorithms for
prediction and optimization, and deliver the trained models to production.
Create and implement algorithms for statistical inference, graph and
network analysis, and natural language processing using open source tools
and libraries.
Build and maintain code to populate HDFS/Hadoop with logs from Kafka or data
loaded from SQL production systems.
Design, build and support algorithms for data transformation, conversion,
and computation on Hadoop, Spark and other distributed Big Data systems.
Design and support effective storage and retrieval of Big Data.
Qualifications
Required Skills:
Experience with the Hadoop stack (Hive, Pig, Hadoop Streaming) and MapReduce.
Expertise in Data Mining, Machine Learning and related algorithms.
Experience in building Machine Learning based data products in production
Database experience with MySQL, MSSQL or equivalent
Experience with HBase or comparable NoSQL.
Proficient in two of the following languages in a Linux/Unix environment:
Java, Python, Scala, C++.
Ph.D. in Computer Science/Engineering or equivalent, plus a minimum of 2-5
years of relevant experience.
Desired:
Experience with Spark MLlib, Mahout
Familiarity with data formats and serialization: XML, JSON, Avro, Thrift,
ProtoBuf
Experience with graph frameworks, such as Giraph, Hama, GraphLab, GraphX
Experience with R and/or MATLAB
Strong communication skills
Have read Tom White's "Hadoop: The Definitive Guide" and Jimmy Lin and Chris
Dyer's "Data-Intensive Text Processing with MapReduce"
To apply:
https://www.smartrecruiters.com/Ancestry/88345635-senior-data-s
or send email to
[email protected]