Posts

Showing posts from October, 2021

What are the SQL constraints?

What are the SQL constraints? Ans: SQL constraints are used to specify rules for the data in a table. Constraints limit the type of data that can go into a table. 1. NOT NULL Constraint − ensures that a column cannot have a NULL value. 2. DEFAULT Constraint − provides a default value for a column when none is specified. 3. UNIQUE Constraint − ensures that all values in a column are different. 4. PRIMARY Key − uniquely identifies each row/record in a database table. 5. FOREIGN Key − references the primary key of another table, linking the rows of the two tables. 6. CHECK Constraint − ensures that all the values in a column satisfy a given condition. 7. INDEX − used to create and retrieve data from the database very qu...
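A minimal sketch of these constraints in action, using SQLite through Python's standard library. The table and column names (`employees`, `departments`, etc.) are hypothetical, not from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE employees (
        id      INTEGER PRIMARY KEY,               -- PRIMARY KEY: identifies each row
        name    TEXT NOT NULL,                     -- NOT NULL: a name is required
        email   TEXT UNIQUE,                       -- UNIQUE: no duplicate emails
        role    TEXT DEFAULT 'staff',              -- DEFAULT: used when no value is given
        age     INTEGER CHECK (age >= 18),         -- CHECK: values must satisfy a condition
        dept_id INTEGER REFERENCES departments(id) -- FOREIGN KEY: links to another table
    )
""")
cur.execute("CREATE INDEX idx_emp_name ON employees(name)")  # INDEX: speeds up lookups

cur.execute("INSERT INTO employees (name, email, age) VALUES ('Ann', 'ann@x.com', 30)")
rows = cur.execute("SELECT name, role FROM employees").fetchall()
print(rows)  # the DEFAULT constraint filled in the role

check_failed = False
try:  # age 10 violates CHECK (age >= 18)
    cur.execute("INSERT INTO employees (name, age) VALUES ('Bob', 10)")
except sqlite3.IntegrityError:
    check_failed = True
print("CHECK rejected the row:", check_failed)
```

Violating any of these constraints raises `sqlite3.IntegrityError`, which is how the database enforces the rules rather than leaving them to application code.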

Key Differences Between Primary key and Unique key

Key Differences Between Primary key and Unique key 1. When an attribute is declared as a primary key, it will not accept NULL values. On the other hand, when an attribute is declared as unique, it can accept one NULL value. 2. A table can have only one primary key, whereas there can be multiple unique constraints on a table. 3. A clustered index is automatically created when a primary key is defined. In contrast, a unique key generates a non-clustered index.
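Differences 1 and 2 can be sketched with SQLite via Python's stdlib. Two caveats on the NULL behaviour: the "accepts one NULL" rule is SQL Server's; SQLite allows multiple NULLs in a UNIQUE column, and it only enforces NOT NULL on primary keys of `WITHOUT ROWID` tables (a documented compatibility quirk), so the table is declared that way here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (pk TEXT PRIMARY KEY, uq TEXT UNIQUE) WITHOUT ROWID")

cur.execute("INSERT INTO t VALUES ('a', NULL)")      # NULL in the UNIQUE column: accepted
pk_rejected_null = False
try:
    cur.execute("INSERT INTO t VALUES (NULL, 'x')")  # NULL in the PRIMARY KEY: rejected
except sqlite3.IntegrityError:
    pk_rejected_null = True
print("primary key rejected NULL:", pk_rejected_null)

# Difference 2: only one PRIMARY KEY per table, but any number of UNIQUE constraints.
cur.execute("CREATE TABLE t2 (a TEXT PRIMARY KEY, b TEXT UNIQUE, c TEXT UNIQUE)")
```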

Key Differences Between Fact Table and Dimension Table

Key Differences Between Fact Table and Dimension Table 1. A fact table contains measurements along the dimensions/attributes of a dimension table. 2. A fact table contains more records and fewer attributes compared to a dimension table, whereas a dimension table contains more attributes and fewer records. 3. The size of a fact table grows vertically, whereas the size of a dimension table grows horizontally. 4. Each dimension table contains a primary key to identify each record in the table, whereas a fact table contains a concatenated key, which is a combination of the primary keys of all the dimension tables. 5. A dimension table has to be created before the fact table. 6. A schema contains fewer fact tables but more dimension tables. 7. Attributes in a fact table are numeric as well as textual, but attributes of a dimension table have textu...
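Points 1 and 4 can be sketched as DDL. The sales example below is hypothetical: two dimension tables each carry a primary key, and the fact table holds the numeric measurements plus a concatenated (composite) key built from the dimension keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);

    -- The fact table holds numeric measurements (units, revenue) along the dimensions.
    CREATE TABLE fact_sales (
        date_id    INTEGER REFERENCES dim_date(date_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        units      INTEGER,
        revenue    REAL,
        PRIMARY KEY (date_id, product_id)   -- concatenated key from the dimension keys
    );
""")
cur.execute("INSERT INTO dim_date VALUES (1, '01', 'Oct', 2021)")
cur.execute("INSERT INTO dim_product VALUES (10, 'Raincoat', 'Apparel')")
cur.execute("INSERT INTO fact_sales VALUES (1, 10, 3, 74.97)")
row = cur.execute("""
    SELECT p.name, d.month, f.units, f.revenue
    FROM fact_sales f
    JOIN dim_date d    ON f.date_id = d.date_id
    JOIN dim_product p ON f.product_id = p.product_id
""").fetchone()
print(row)  # ('Raincoat', 'Oct', 3, 74.97)
```

Note that the dimension rows must exist before the fact row can reference them, which is difference 5 above.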

Differences Between Data Warehouse and Data Mart

Key Differences Between Data Warehouse and Data Mart 1. A data warehouse is application-independent, whereas a data mart is specific to a decision-support-system application. 2. The data is stored in a single, centralized repository in a data warehouse. In contrast, a data mart stores data decentrally in the user area. 3. A data warehouse contains data in detailed form. In contrast, a data mart contains summarized and selected data. 4. The data in a data warehouse is slightly denormalized, while in the case of a data mart it is highly denormalized. 5. The construction of a data warehouse involves a top-down approach. Conversely, while constructing a data mart the bottom-up approach is used. 6. A data warehouse is flexible, information-oriented, and longtime e...

Differences Between Star and Snowflake Schema

Key Differences Between Star and Snowflake Schema 1. A star schema contains just one dimension table for each dimension entry, while in a snowflake schema there may be a dimension table plus sub-dimension tables for one entry. 2. Normalization is used in the snowflake schema, which eliminates data redundancy. In contrast, normalization is not performed in the star schema, which results in data redundancy. 3. The star schema is simple, easy to understand, and involves less intricate queries. On the contrary, the snowflake schema is harder to understand and involves complex queries. 4. The data model approach used in a star schema is top-down, whereas the snowflake schema uses bottom-up. 5. The star schema uses a smaller number of joins. On the other hand, the snowflake schema uses a large number of joins. 6. The space consumed by the star schema is more compared to the snowflake schema. 7. The ...
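Differences 1, 2 and 5 can be made concrete with a hypothetical product dimension: the star version stores the category name inline (denormalized, repeated per product), while the snowflake version normalizes it into a sub-dimension reached by an extra join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    -- Star schema: one flat dimension table per dimension.
    CREATE TABLE star_dim_product (
        product_id INTEGER PRIMARY KEY,
        name TEXT,
        category_name TEXT          -- repeated for every product in the category
    );

    -- Snowflake schema: the dimension is split into dimension + sub-dimension.
    CREATE TABLE snow_dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE snow_dim_product (
        product_id INTEGER PRIMARY KEY,
        name TEXT,
        category_id INTEGER REFERENCES snow_dim_category(category_id)
    );
""")
# Star: 'Apparel' is stored twice; snowflake: stored once, referenced twice.
cur.executemany("INSERT INTO star_dim_product VALUES (?, ?, ?)",
                [(1, 'Raincoat', 'Apparel'), (2, 'Jacket', 'Apparel')])
cur.execute("INSERT INTO snow_dim_category VALUES (1, 'Apparel')")
cur.executemany("INSERT INTO snow_dim_product VALUES (?, ?, ?)",
                [(1, 'Raincoat', 1), (2, 'Jacket', 1)])

# Reading the snowflake version needs one more join than the star version.
rows = cur.execute("""
    SELECT p.name, c.category_name
    FROM snow_dim_product p JOIN snow_dim_category c USING (category_id)
    ORDER BY p.product_id
""").fetchall()
print(rows)  # [('Raincoat', 'Apparel'), ('Jacket', 'Apparel')]
```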

Normalization vs Denormalization

Key Differences Between Normalization and Denormalization 1. Normalization is the technique of dividing the data into multiple tables to reduce data redundancy and inconsistency and to achieve data integrity. On the other hand, denormalization is the technique of combining the data into a single table to make data retrieval faster. 2. Normalization is used in OLTP systems, where the emphasis is on making inserts, deletes and updates faster and free of anomalies. In contrast, denormalization is used in OLAP systems, where the emphasis is on making search and analysis faster. 3. Data integrity is maintained in the normalization process, while in denormalization data integrity is harder to retain. 4. Redundant data is eliminated when normalization is performed, whereas denormalization increases the redundant data. 5. Normalization increases the number of tables and joins. In contrast, denormalization reduces the number of tables and joins. 6. ...
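The update-anomaly point (differences 1 and 3) can be shown with toy, made-up data: the normalized form stores each customer's city once, so an update touches one row, while the denormalized form repeats the city per order and must update every copy to stay consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    -- Normalized: customer data lives in one place, orders reference it.
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
    -- Denormalized: the customer's city is copied into every order row.
    CREATE TABLE orders_denorm (order_id INTEGER PRIMARY KEY, cust_city TEXT, amount REAL);
""")
cur.execute("INSERT INTO customers VALUES (1, 'Pune')")
cur.executemany("INSERT INTO orders VALUES (?, 1, ?)", [(1, 10.0), (2, 20.0)])
cur.executemany("INSERT INTO orders_denorm VALUES (?, 'Pune', ?)", [(1, 10.0), (2, 20.0)])

# Normalized: the city changes in exactly one row.
cur.execute("UPDATE customers SET city = 'Mumbai' WHERE cust_id = 1")
# Denormalized: every redundant copy must be updated, or the data diverges.
cur.execute("UPDATE orders_denorm SET cust_city = 'Mumbai'")

# The trade-off: the denormalized read needs no join.
rows = cur.execute("SELECT cust_city, amount FROM orders_denorm ORDER BY order_id").fetchall()
print(rows)
```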

Univariate, Bivariate and Multivariate Analysis by EDA

# Data science life cycle: Every data science beginner, working professional, student or practitioner follows a few steps while working on a project. I will explain all these steps in simple terms for your understanding. # 1. Hypothesis definition: a proposed explanation used as a starting point for further investigation. Ex: a company A wants to release a raincoat (product) in summer; the company is in a dilemma over whether to release the product or not. (I know it's a bad idea, but let's use it for understanding.) # 2. Data acquisition: collecting the required data. Ex: collecting the last 10 years of data for a certain region. # 3. Exploratory Data Analysis (EDA): analysing the collected data using some concepts (we will see them below). Ex: on the collected (existing) data, data scientists perform analysis and decide which features/metrics to consider for model building. # 4. Model building: this is where machine learning comes into the light. # Ex: by using metrics (out...
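Steps 2-3 can be sketched with the raincoat example, using made-up numbers: "acquire" ten years of summer rainfall (mm, hypothetical) for a region, then do a first univariate EDA pass with stdlib summary statistics.

```python
import statistics

# Step 2, data acquisition (hypothetical sample: 10 years of summer rainfall in mm).
rainfall_mm = [12.0, 5.5, 30.2, 8.1, 0.0, 15.4, 22.3, 3.9, 11.7, 27.5]

# Step 3, a first EDA pass: central tendency, spread, and a simple derived metric.
mean = statistics.mean(rainfall_mm)
spread = statistics.stdev(rainfall_mm)
rainy_years = sum(1 for r in rainfall_mm if r > 10)

print(f"mean={mean:.1f}mm stdev={spread:.1f}mm years_over_10mm={rainy_years}/10")
```

Summaries like these feed the decision in step 1: if most summers see meaningful rainfall, the "release a raincoat in summer" hypothesis is worth taking to model building.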

Multiple_linear_regression

# What is multiple_linear_regression * Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. * Multiple regression is an extension of linear (OLS) regression, which uses just one explanatory variable. * MLR is used extensively in econometrics and financial inference. * Regression models are used to describe relationships between variables by fitting a line to the observed data. Regression allows you to estimate how the dependent variable changes as the independent variable(s) change. * Multiple linear regression is used to estimate the relationship between two or more independent variables and one dependent variable. You can use multiple linear regression when you want to know: * How strong the relationship is between two or more independent variables and one dependent variable (e.g. how rainfall, temperature, and amount of fertilizer added affect c...
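A minimal sketch of MLR with two explanatory variables, fit by solving the OLS normal equations (XᵀX)β = Xᵀy with Gaussian elimination. The data are made up so that y = 2 + 3·x1 + 0.5·x2 exactly, letting the fit recover the coefficients.

```python
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
y  = [6.0, 8.5, 13.0, 15.5, 19.5]   # = 2 + 3*x1 + 0.5*x2, by construction

X = [[1.0, a, b] for a, b in zip(x1, x2)]   # design matrix with intercept column

# Build the normal equations: A = X^T X, rhs = X^T y.
n = len(X[0])
A = [[sum(r[i] * r[j] for r in X) for j in range(n)] for i in range(n)]
rhs = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]

# Solve A beta = rhs by Gaussian elimination with partial pivoting.
for col in range(n):
    piv = max(range(col, n), key=lambda r: abs(A[r][col]))
    A[col], A[piv] = A[piv], A[col]
    rhs[col], rhs[piv] = rhs[piv], rhs[col]
    for r in range(col + 1, n):
        f = A[r][col] / A[col][col]
        A[r] = [a - f * b for a, b in zip(A[r], A[col])]
        rhs[r] -= f * rhs[col]
beta = [0.0] * n
for i in reversed(range(n)):
    beta[i] = (rhs[i] - sum(A[i][j] * beta[j] for j in range(i + 1, n))) / A[i][i]

print([round(b, 3) for b in beta])  # → [2.0, 3.0, 0.5]
```

In practice a library solver would be used; the point here is that MLR is ordinary least squares with more than one column in the design matrix.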

Multi-Collinearity in Machine Learning

# What is multicollinearity? >Multicollinearity occurs when independent variables in a regression model are correlated. This correlation is a problem because independent variables should be independent. If the degree of correlation between variables is high enough, it can cause problems when you fit the model and interpret the results. >Multicollinearity occurs when two or more independent variables are highly correlated with one another in a regression model. This means that one independent variable can be predicted from another independent variable in the model. # Why is multicollinearity a problem? When independent variables are highly correlated, a change in one variable is accompanied by a change in another, so the model results fluctuate significantly. The model results become unstable and vary a lot given a small change in the data or model. This creates the following problems: 1. It would be hard for you to choose the list of significant variables for the mod...
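A common first check is the pairwise Pearson correlation between candidate predictors (a fuller diagnosis would use the variance inflation factor, VIF). The data below are made up: x2 is roughly 2·x1, so the two are nearly collinear, while x3 is unrelated.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 8.0, 9.9]   # roughly 2 * x1: highly collinear with x1
x3 = [5.0, 1.0, 4.0, 2.0, 3.0]   # unrelated to x1

print(f"corr(x1, x2) = {pearson(x1, x2):.3f}")  # near 1 -> multicollinearity risk
print(f"corr(x1, x3) = {pearson(x1, x3):.3f}")  # low magnitude -> fine to keep both
```

A correlation near ±1 between two predictors is the symptom described above: one variable can be predicted from the other, so one of them is usually dropped or the pair is combined.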

Linear Regression Indepth

# What is a Regression * In regression, we fit a line or curve through the given data points on a graph between the variables, so that the machine learning model can deliver predictions regarding the data. * In naïve words, “Regression shows a line or curve that passes through the data points on a target-predictor graph in such a way that the vertical distance between the data points and the regression line is minimum.” # Types of Regression models * Linear Regression * Polynomial Regression * Logistic Regression # Linear Regression in Machine Learning: * Linear regression is one of the easiest and most popular machine learning algorithms. It is a statistical method that is used for predictive analysis. Linear regression makes predictions for continuous/real or numeric variables such as sales, salary, age, product price, etc. * The linear regression algorithm models a linear relationship between a dependent (y) variable and one or more independent (x) variables, hence the name linear regression....
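A minimal sketch of simple linear regression (one independent variable), fitting y = b0 + b1·x by the closed-form ordinary-least-squares formulas. The experience/salary numbers are hypothetical, chosen only to echo the salary example above.

```python
x = [1, 2, 3, 4, 5]                  # e.g. years of experience (hypothetical)
y = [30.0, 35.0, 41.0, 44.0, 50.0]   # e.g. salary in thousands (hypothetical)

n = len(x)
mx, my = sum(x) / n, sum(y) / n
# OLS slope: covariance of x and y divided by variance of x.
b1 = sum((a - b) * (c - d) for a, b, c, d in zip(x, [mx] * n, y, [my] * n)) \
     / sum((a - mx) ** 2 for a in x)
b0 = my - b1 * mx   # intercept: the fitted line passes through (mean x, mean y)

print(f"y = {b0:.2f} + {b1:.2f} * x")
print(f"prediction at x=6: {b0 + b1 * 6:.2f}")
```

This is exactly the "minimum vertical distance" idea quoted above: b0 and b1 are the values that minimize the sum of squared vertical distances from the points to the line.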

Constructors VS Destructors in Python

Difference between Constructors VS Destructors in Python What is a constructor in Python? ANS: Constructors are generally used for instantiating an object. The task of a constructor is to initialize (assign values to) the data members of the class when an object of the class is created. In Python, the __init__() method is called the constructor and is always called when an object is created. Syntax of constructor declaration: def __init__(self): # body of the constructor. Types of constructors: 1. Default constructor: the default constructor is a simple constructor which doesn't accept any arguments. Its definition has only one argument, which is a reference to the instance being constructed. 2. Parameterized constructor: a constructor with parameters is known as a parameterized constructor. The parameterized constructor takes its first argument as a reference to the instance being constructed, known ...
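Both halves of the comparison in one sketch (the `Employee` class is hypothetical): `__init__` (the constructor) runs when the object is created, and `__del__` (the destructor) runs when the object is garbage-collected. A default argument makes one `__init__` serve as both a default and a parameterized constructor.

```python
class Employee:
    def __init__(self, name="unknown"):   # default + parameterized constructor in one
        self.name = name                  # initialize the data members
        print(f"constructor: created {self.name}")

    def __del__(self):                    # destructor
        print(f"destructor: releasing {self.name}")

e = Employee("Ann")   # prints the constructor line
del e                 # drops the last reference, so the destructor line prints
```

Note that `__del__` timing is an implementation detail: CPython's reference counting calls it as soon as the last reference goes away, but other Python implementations may delay it until the garbage collector runs.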