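l******o Posts: 52 | 1 Or data cleansing, data quality control, etc.
Late last year Gartner published a Magic Quadrant report on Data Quality Tools that
summarizes the relevant vendors. I can't say whether their choice of vendors is sound,
but their summary of data quality control is on point. Now that data is being collected
in such large volumes, emphasizing data cleansing and data quality control is especially
necessary. Remember: "Garbage in, garbage out".
The report was originally available from http://www.gartner.com/technology/reprints.do?id=1-1LCD5XL&ct=131007&st=sb,
but not any more. I'll excerpt a bit of it here and also include their current paid link,
for everyone's reference and as a bit of free advertising for them. | l******o Posts: 52 | 2 Magic Quadrant for Data Quality Tools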
gartner.com, October 7
Data quality assurance is a discipline focused on ensuring that data is fit
for use in business processes ranging from core operations to analytics and
decision-making, regulatory compliance, and engagement and interaction with
external entities.
As a discipline, it comprises much more than technology — it also includes
roles and organizational structures, processes for monitoring, measuring,
reporting and remediating data quality issues, and links to broader
information governance activities via data-quality-specific policies.
Given the scale and complexity of the data landscape across organizations of
all sizes and in all industries, tools to help automate key elements of the
discipline continue to attract more interest and to grow in value. As such,
the data quality tools market continues to show substantial growth, while
exhibiting innovation and change.
The data quality tools market includes vendors that offer stand-alone
software products to address the core functional requirements of the
discipline, which are:
Data profiling and data quality measurement: The analysis of data to capture
statistics (metadata) that provide insight into the quality of data and
help to identify data quality issues.
Parsing and standardization: The decomposition of text fields into component
parts and the formatting of values into consistent layouts based on
industry standards, local standards (for example, postal authority standards
for address data), user-defined business rules, and knowledge bases of
values and patterns.
Generalized "cleansing": The modification of data values to meet domain
restrictions, integrity constraints or other business rules that define when
the quality of data is sufficient for an organization.
Matching: Identifying, linking or merging related entries within or across
sets of data.
Monitoring: Deploying controls to ensure that data continues to conform to
business rules that define data quality for the organization.
Enrichment: Enhancing the value of internally-held data by appending related
attributes from external sources (for example, consumer demographic
attributes and geographic descriptors).
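To make three of the core functions above concrete (profiling, parsing and standardization, matching), here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not how any vendor's tool works: the sample rows, the 10-digit phone rule, the function names and the 0.85 similarity threshold are all hypothetical choices made for this example.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

def profile(rows, column):
    """Data profiling: basic statistics (metadata) about one column."""
    values = [r.get(column) for r in rows]
    non_null = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "null_rate": 1 - len(non_null) / len(values) if values else 0.0,
        "distinct": len(set(non_null)),
        "top": Counter(non_null).most_common(3),
    }

def standardize_phone(raw):
    """Parsing and standardization: reduce a phone string to NNN-NNN-NNNN.

    Assumes a 10-digit number with an optional leading country code 1
    (a hypothetical business rule for this example).
    """
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None  # cannot be standardized under this rule
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

def is_match(a, b, threshold=0.85):
    """Matching: fuzzy comparison after case and whitespace normalization."""
    def norm(s):
        return " ".join(s.lower().split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio() >= threshold

# Hypothetical sample data
rows = [
    {"name": "Acme Corp", "phone": "(415) 555-0100"},
    {"name": "ACME  Corp.", "phone": "415.555.0100"},
    {"name": "Globex", "phone": ""},
]

phone_profile = profile(rows, "phone")
standardized = [standardize_phone(r["phone"]) for r in rows]
same_company = is_match(rows[0]["name"], rows[1]["name"])
```

In this sketch the first two rows standardize to the same phone number and their names pass the fuzzy match, so a real tool would likely link or merge them. Commercial products rely on much richer knowledge bases and matching rules than a single string-similarity ratio.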
In addition, data quality tools provide a range of related functional
abilities that are not unique to this market but that are required to
execute many of the core functions of data quality, or for specific data
quality applications:
Connectivity/adapters: The ability to interact with a range of different
data structure types.
Subject-area-specific support: Standardization capabilities for specific
data subject areas.
International support: The ability to offer relevant data quality operations
on a global basis (such as handling data in multiple languages and writing
systems).
Metadata management: The ability to capture, reconcile and interoperate
metadata related to the data quality process.
Configuration environment: Capabilities for creating, managing and deploying
data quality rules.
Operations and administration: Facilities for supporting, managing and
controlling data quality processes.
Workflow/data quality process support: Processes and user interfaces for
various data quality roles, such as data stewards.
Service enablement: Service-oriented characteristics and support for
service-oriented architecture (SOA) deployments.
The tools provided by vendors in this market are generally consumed by end-
user organizations for internal deployment in their IT infrastructure — to
directly support transactional processes that require data quality
operations and to enable staff in data-quality-oriented roles (such as data
stewards) to engage in data quality improvement work. Off-premises solutions
in the form of hosted data quality offerings, SaaS delivery models and
cloud services continue to evolve and grow in popularity.
For vendors to be included in the Magic Quadrant, they must meet the
following criteria:
They must offer stand-alone packaged software tools or cloud-based services
(not only embedded in, or dependent on, other products
| l******o Posts: 52 | 3 Paid link: http://gtnr.it/1tdIeVw
| g********s Posts: 3652 | 7 The first step in a data warehouse is ETL (extract, transform, load), which is essentially cleaning the data.
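As a rough illustration of the ETL point, here is a minimal Python sketch in which the transform step does the cleaning. Everything in it is an assumption made for the example: the inline CSV data, the `users` table, and the rule that rows without an email are dropped.

```python
import csv
import io
import sqlite3

# Hypothetical raw input with messy whitespace, mixed case and a missing value
RAW = "id,email\n1, Alice@Example.com \n2,\n3,bob@example.com\n"

def extract(text):
    """Extract: read raw records from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and lowercase emails, drop rows with no email."""
    cleaned = []
    for row in rows:
        email = (row["email"] or "").strip().lower()
        if email:
            cleaned.append((int(row["id"]), email))
    return cleaned

def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW)), conn)
result = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()
```

Only the two rows with a usable email reach the table. In practice the transform step also covers reshaping and integration, so cleaning is one part of it rather than the whole.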