Glossary of Data Science and Data Analytics

What is a Data Mart?

A data mart is a slice of the data warehouse logical model that serves a narrow group of users. Many data marts need only a subset of the data from the full tables in the data warehouse. For example, a data mart might contain only sales transactions, products, and inventory records. Many data marts have only 5-20 tables instead of the 4,000 that a full data warehouse might contain.

Data Mart Fact Tables

The number of tables in a data mart has nothing to do with the size of the database. The major tables, called fact tables, can be hundreds of terabytes in size; for a telecommunications company, for example, they may consist of call detail records. The data mart itself can be large, but it selects only a small part of all the data contained in the data warehouse.

Data marts are usually denormalized by extracting data and aggregating it into a result table, often discarding the detailed data. Some data marts are fully reloaded weekly or monthly; it is relatively easy to delete and refresh all the data so that reports look only at transactions from the last 30 days.
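The delete-and-refresh pattern can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a real warehouse; the table names (sales, products, mart_daily_sales), the column names, and the demo rows are all hypothetical, chosen only to show aggregation into a result table limited to the last 30 days.

```python
import sqlite3
from datetime import date, timedelta


def create_demo_warehouse():
    """Create a tiny in-memory stand-in for the warehouse detail tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE sales (
            sale_id INTEGER PRIMARY KEY,
            sale_date TEXT,          -- ISO dates compare correctly as text
            product_id INTEGER REFERENCES products(product_id),
            quantity INTEGER,
            unit_price REAL
        );
        -- The data mart table: denormalized and aggregated, no detail rows.
        CREATE TABLE mart_daily_sales (
            sale_date TEXT, product_name TEXT, units INTEGER, revenue REAL
        );
        INSERT INTO products VALUES (1, 'Widget'), (2, 'Gadget');
        INSERT INTO sales VALUES
            (1, '2024-03-30', 1, 2, 10.0),
            (2, '2024-03-30', 1, 3, 10.0),
            (3, '2024-03-29', 2, 1, 25.0),
            (4, '2024-01-15', 1, 5, 10.0);
        """
    )
    return conn


def refresh_sales_mart(conn, today):
    """Delete everything in the mart and reload the last 30 days, aggregated."""
    cutoff = (today - timedelta(days=30)).isoformat()
    with conn:  # commit both statements together, or roll both back on error
        conn.execute("DELETE FROM mart_daily_sales")
        conn.execute(
            """
            INSERT INTO mart_daily_sales (sale_date, product_name, units, revenue)
            SELECT s.sale_date, p.product_name,
                   SUM(s.quantity), SUM(s.quantity * s.unit_price)
            FROM sales AS s
            JOIN products AS p ON p.product_id = s.product_id
            WHERE s.sale_date >= ?
            GROUP BY s.sale_date, p.product_name
            """,
            (cutoff,),
        )


def read_sales_mart(conn):
    """Return the mart rows in a stable order for reporting."""
    return conn.execute(
        "SELECT sale_date, product_name, units, revenue "
        "FROM mart_daily_sales ORDER BY sale_date, product_name"
    ).fetchall()
```

Calling refresh_sales_mart(conn, date(2024, 3, 31)) on the demo data leaves two summary rows in the mart; the January sale falls outside the 30-day window and is dropped. Because the function deletes before it reloads, running it again produces the same result.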

Data Mart and Star Schema

Data marts and the star schema are inextricably linked. Consider rows and columns of data in five spreadsheets. Four of the spreadsheets are linked by key fields to the largest one, which corresponds to the fact table. Imagine that the fact table holds 50 million records, which will not fit in a single spreadsheet, so they are stored in the data mart tables instead. Most data marts have 5-10 tables in this star schema design, and the small tables in the arms of the star are called dimension tables.
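As a rough illustration of the five-table picture above, the following sketch builds one fact table and four dimension tables with Python's built-in sqlite3 module and then answers a question by joining the fact table to two of its dimensions. All table names, column names, and rows are hypothetical.

```python
import sqlite3

# One fact table in the center, four small dimension tables in the arms.
STAR_SCHEMA = """
CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store    (store_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    store_key    INTEGER REFERENCES dim_store(store_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    amount       REAL
);
"""

DEMO_ROWS = """
INSERT INTO dim_date     VALUES (1, '2024-01'), (2, '2024-02');
INSERT INTO dim_product  VALUES (1, 'Toys'), (2, 'Books');
INSERT INTO dim_store    VALUES (1, 'Ankara'), (2, 'Izmir');
INSERT INTO dim_customer VALUES (1, 'Retail'), (2, 'Corporate');
INSERT INTO fact_sales VALUES
    (1, 1, 1, 1, 100.0),
    (1, 2, 1, 2, 40.0),
    (2, 1, 2, 1, 60.0),
    (2, 1, 1, 2, 30.0);
"""


def build_star_mart():
    """Create the demo star schema in memory and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(STAR_SCHEMA)
    conn.executescript(DEMO_ROWS)
    return conn


def revenue_by_month_and_category(conn):
    """Join the fact table to two dimensions through their key fields."""
    return conn.execute(
        """
        SELECT d.month, p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date    AS d ON d.date_key = f.date_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY d.month, p.category
        ORDER BY d.month, p.category
        """
    ).fetchall()
```

The fact table stores only keys and measures; descriptive attributes such as month or category live in the dimension tables and are reached through the joins.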

Data Mart and Snowflake Schema

Dimensions are small tables with important information. The fact table is where the bulk of the data, perhaps billions of records, is stored, and it can be linked to the customer table to retrieve the actual customer name and address fields. One variation, the snowflake schema, splits dimension tables into further related tables and can include multiple fact tables connected by key fields. Each fact table has four or five dimension tables; a schematic of the tables and their relationships looks like a snowflake, but it is still a data mart.
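The customer lookup described above can be sketched as follows, again with Python's built-in sqlite3 module. The sketch assumes the usual snowflake layout, in which a dimension (here the customer table) is itself linked to a further table (here a city table); every table, column, and row is hypothetical.

```python
import sqlite3


def build_snowflake_mart():
    """Create a fact table, a customer dimension, and a city sub-dimension."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_city (
            city_key INTEGER PRIMARY KEY, city_name TEXT, region TEXT
        );
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,
            name TEXT,
            address TEXT,
            city_key INTEGER REFERENCES dim_city(city_key)
        );
        CREATE TABLE fact_orders (
            order_id INTEGER PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            amount REAL
        );
        INSERT INTO dim_city VALUES
            (1, 'Istanbul', 'Marmara'),
            (2, 'Ankara', 'Central Anatolia');
        INSERT INTO dim_customer VALUES
            (10, 'Ayse Yilmaz', '1 Example St', 1),
            (11, 'Mehmet Demir', '2 Sample Ave', 2);
        INSERT INTO fact_orders VALUES
            (1, 10, 250.0),
            (2, 11, 75.0),
            (3, 10, 50.0);
        """
    )
    return conn


def customer_totals(conn):
    """Resolve fact rows to customer name, address, and region via key fields."""
    return conn.execute(
        """
        SELECT c.name, c.address, ci.region, SUM(f.amount)
        FROM fact_orders AS f
        JOIN dim_customer AS c  ON c.customer_key = f.customer_key
        JOIN dim_city     AS ci ON ci.city_key = c.city_key
        GROUP BY c.customer_key, c.name, c.address, ci.region
        ORDER BY c.name
        """
    ).fetchall()
```

The fact table never stores the customer name or address; it stores only customer_key, and the readable fields are fetched from the dimension tables at query time.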

What is the difference between a Data Mart and a Data Warehouse?

The difference between a data mart and a data warehouse lies in subject areas and integration; the two are distinguished not by the size of the database but by the complexity of the schema. Because all the data is in the data warehouse, the questions that can be asked of it can be as much as 100 times more complex than the questions that can be asked of a data mart.

A data warehouse contains a large number of “puzzle pieces”: all the integrated tables, grouped by subject area. The data warehouse doesn't have to be huge; it might have just five terabytes of storage, or it could hold hundreds of terabytes of records. The alternative is to store three large tables in a single data mart.

How to Load and Move Data into a Data Mart

There are many great tools for data integration, many great relational databases for holding data, and dozens of excellent tools for analyzing data. Fortunately, moving data into business intelligence (BI) tools is not labor-intensive, and not much data is transferred. Small amounts of data are sent to the BI tool for display in reports or tables.

The real costs are in moving and transforming data for other purposes. Extracting and transforming data is costly and often slow. The integration phase is both labor-intensive and compute-intensive, but the alternative is to give business users broken, incomplete, or inaccurate data. To be clear, the fastest way to get business users to abandon a data warehouse or data mart is to give them dirty and missing data. If they can't trust the data, they go back to their spreadsheets. This explains why the data in a data warehouse is so valuable, and why it is risky to proliferate data marts across a business when the goal is to fully grasp the true state of the business.
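One common way to keep dirty and missing data out of a data mart is to validate rows during the transformation step and set aside anything that fails. The sketch below is a minimal illustration in plain Python; the required field names and the validation rules are assumptions made for the example, not rules taken from any particular tool.

```python
# Hypothetical fields that every row must carry before it is loaded.
REQUIRED_FIELDS = ("customer_id", "sale_date", "amount")


def split_clean_and_rejected(rows):
    """Separate loadable rows from rows with missing or invalid values."""
    clean, rejected = [], []
    for row in rows:
        missing = [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]
        if missing:
            rejected.append((row, "missing: " + ", ".join(missing)))
        elif not isinstance(row["amount"], (int, float)) or row["amount"] < 0:
            rejected.append((row, "invalid amount"))
        else:
            clean.append(row)
    return clean, rejected


extracted = [
    {"customer_id": 10, "sale_date": "2024-03-30", "amount": 25.0},
    {"customer_id": None, "sale_date": "2024-03-30", "amount": 12.5},
    {"customer_id": 11, "sale_date": "2024-03-29", "amount": -4.0},
    {"customer_id": 12, "sale_date": "", "amount": 8.0},
]
clean_rows, rejected_rows = split_clean_and_rejected(extracted)
```

Keeping the rejected rows together with a reason, rather than silently dropping them, lets the source problem be investigated while only trusted rows reach the users.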

Data Mart Benefits

Data marts allow for easier analysis without the complexity of a data warehouse. In addition, data marts can be created faster, which speeds up workflows and makes information easier to access. Because a data mart is structured to provide summary data on a specific topic, it returns results faster and saves users time.
