Define Facets of Data
In the foundations of data science, the focus is on the facets of data. This can involve processing complex, integrated datasets and building predictive models from that data. The processing of a dataset is classified into:
- Upstream process
- Downstream process
There are many facets of data science:
- Identifying the structure of data.
- Cleaning, filtering, reorganising, augmenting and aggregating data.
- Visualizing data.
- Data analysis, statistics and modelling, and machine learning.
- Assembling data processing pipelines to link these steps.
- Leveraging high-end computational resources for large-scale problems.
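The cleaning, filtering and aggregating steps listed above, linked into a small pipeline, can be sketched as follows. This is a minimal illustration only: the records, the field names (`city`, `temp_c`) and the 30-degree threshold are made-up assumptions, not part of any real dataset.

```python
from collections import defaultdict

# Hypothetical raw records; names and values are invented for illustration.
raw_records = [
    {"city": "Pune", "temp_c": "31.5"},
    {"city": "Pune", "temp_c": ""},      # missing value, removed by clean()
    {"city": "Delhi", "temp_c": "38.0"},
    {"city": "Delhi", "temp_c": "40.0"},
    {"city": "Pune", "temp_c": "29.5"},
]


def clean(records):
    """Drop records with a missing temperature and convert the rest to float."""
    return [
        {"city": r["city"], "temp_c": float(r["temp_c"])}
        for r in records
        if r["temp_c"].strip()
    ]


def filter_hot(records, threshold=30.0):
    """Keep only readings at or above the (assumed) threshold."""
    return [r for r in records if r["temp_c"] >= threshold]


def aggregate_mean(records):
    """Group readings by city and compute the mean temperature per city."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["city"]].append(r["temp_c"])
    return {city: sum(values) / len(values) for city, values in grouped.items()}


def run_pipeline(records):
    """Link the individual steps into one pipeline."""
    return aggregate_mean(filter_hot(clean(records)))


result = run_pipeline(raw_records)
print(result)
```

Keeping each step as its own function makes it easy to test a step in isolation or to swap one step for another without touching the rest of the pipeline.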
Main categories of Data
- Structured
- Unstructured
- Natural Language
- Machine-generated
- Graph-based
- Audio, video and images
- Streaming
Structured Data
Hierarchical data, such as a family tree, is also called structured data because it is stored in a particular structure.
Example
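A family tree can be sketched as a nested structure in which every person has the same fields. The names and the `name` / `children` field layout below are hypothetical, chosen only to show how a fixed structure makes the data easy to traverse.

```python
# Hypothetical family tree: every person has the same two fields.
family_tree = {
    "name": "Asha",
    "children": [
        {
            "name": "Ravi",
            "children": [
                {"name": "Meera", "children": []},
            ],
        },
        {"name": "Kiran", "children": []},
    ],
}


def descendants(person):
    """Return the names of all descendants, walking the tree depth-first."""
    names = []
    for child in person["children"]:
        names.append(child["name"])
        names.extend(descendants(child))
    return names


print(descendants(family_tree))
```

Because the structure is the same at every level, one short recursive function is enough to visit the whole tree.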
Unstructured Data
Unstructured data is data that is not easy to fit into a data model because the content is context-specific or varying.
Natural Language
Natural language is a special type of unstructured data. It is challenging to process because it requires knowledge of specific data science techniques and linguistics.
Example
Alexa, Apple Siri, ChatGPT
Machine-generated Data
Machine-generated data is information that is automatically created by a computer, process, application or other machine without human intervention.
Example
Random number generators, OTP generators, etc.
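An OTP generator can be sketched in a few lines. This is a minimal illustration, assuming a numeric code of six digits; the function name `generate_otp` and the default length are assumptions, and a real OTP service would also handle expiry and delivery.

```python
import secrets

DIGITS = "0123456789"


def generate_otp(length=6):
    """Build a numeric one-time password from cryptographically strong random digits."""
    return "".join(secrets.choice(DIGITS) for _ in range(length))


otp = generate_otp()
print(otp)
```

Each call produces a new code with no human input, which is what makes the output machine-generated data.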
Graph-based Data
A graph is a mathematical structure that models pairwise relationships between objects. Graph-based data focuses on the relationships or adjacency of objects.
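One common way to store such pairwise relationships is an adjacency list. The sketch below assumes an undirected graph, and the people and friendships in it are hypothetical.

```python
# Hypothetical friendships; each pair is one undirected edge.
edges = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Dave")]


def build_adjacency(edge_list):
    """Map every object to the set of objects it is directly related to."""
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)  # undirected: store both directions
    return adjacency


adjacency = build_adjacency(edges)
print(sorted(adjacency["Alice"]))
```

Looking up a key in `adjacency` answers the question "who is adjacent to this object?" directly, which is the kind of query graph-based data is organised around.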
Audio, Image and Video Data
Video, images and audio are data types that pose specific challenges to a data scientist. Tasks such as recognizing objects in pictures are more challenging for computers than for humans.