import React from 'react'; 
import {Link} from 'react-router-dom'; 
import useCustomEffect from '../../../useCustomEffect'; 
export default function Python2(){
useCustomEffect()
return ( <div>
<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h3 id="Quick-Overview-of-a-DataFrame">Quick Overview of a DataFrame<a className="anchor-link" href="#Quick-Overview-of-a-DataFrame">¶</a></h3><p>After loading data into a pandas DataFrame, the first step is usually to inspect its structure and contents.</p>
</div>
</div>
</div>
</div>
<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<p>This tutorial uses the classic Iris dataset, which can be downloaded here: <a href="https://3codeacademy.s3.amazonaws.com/dataset/python/Iris.csv" id="downloadData">Iris dataset</a>.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [2]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span><span className="kn">import</span> <span className="nn">pandas</span> <span className="k">as</span> <span className="nn">pd</span></span>

<span><span className="n">df</span> <span className="o">=</span> <span className="n">pd</span><span className="o">.</span><span className="n">read_csv</span><span className="p">(</span><span className="s1">'Iris.csv'</span><span className="p">)</span></span>

</code></pre></div>
</div>
</div>
</div>
</div>
</div>
<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h4 id="1.-DataFrame-Overview-with-df.head()">1. DataFrame Overview with <code>df.head()</code><a className="anchor-link" href="#1.-DataFrame-Overview-with-df.head()">¶</a></h4><p>In pandas, you can use the <code>df.head(n)</code> method to view the first <code>n</code> rows of a DataFrame. If <code>n</code> is not specified, it defaults to 5, which makes this a convenient way to get a quick look at your data.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [3]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span><span className="n">df</span><span className="o">.</span><span className="n">head</span><span className="p">()</span></span>

</code></pre></div>
</div>
</div>
</div>
</div>
<div className="jp-Cell-outputWrapper">
<div className="jp-Collapser jp-OutputCollapser jp-Cell-outputCollapser">
</div>
<div className="jp-OutputArea jp-Cell-outputArea">
<div className="jp-OutputArea-child">
<div></div><div className="jp-RenderedHTMLCommon jp-RenderedHTML jp-OutputArea-output jp-OutputArea-executeResult" data-mime-type="text/html">
<div>

<br /><table border="1" className="dataframe">
<thead>
<tr>
<th></th>
<th>Id</th>
<th>SepalLengthCm</th>
<th>SepalWidthCm</th>
<th>PetalLengthCm</th>
<th>PetalWidthCm</th>
<th>Species</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>1</td>
<td>5.1</td>
<td>3.5</td>
<td>1.4</td>
<td>0.2</td>
<td>Iris-setosa</td>
</tr>
<tr>
<th>1</th>
<td>2</td>
<td>4.9</td>
<td>3.0</td>
<td>1.4</td>
<td>0.2</td>
<td>Iris-setosa</td>
</tr>
<tr>
<th>2</th>
<td>3</td>
<td>4.7</td>
<td>3.2</td>
<td>1.3</td>
<td>0.2</td>
<td>Iris-setosa</td>
</tr>
<tr>
<th>3</th>
<td>4</td>
<td>4.6</td>
<td>3.1</td>
<td>1.5</td>
<td>0.2</td>
<td>Iris-setosa</td>
</tr>
<tr>
<th>4</th>
<td>5</td>
<td>5.0</td>
<td>3.6</td>
<td>1.4</td>
<td>0.2</td>
<td>Iris-setosa</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
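<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<p>As a small illustrative sketch (this cell was not part of the original run), passing an explicit <code>n</code> changes how many rows come back. It assumes the <code>df</code> loaded from <code>Iris.csv</code> earlier.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [ ]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span># Show only the first 3 rows instead of the default 5</span>
<br /><span>df.head(3)</span>
</code></pre></div>
</div>
</div>
</div>
</div>
</div>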
<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h4 id="2.-Inspect-Last-n-Rows-with-df.tail()">2. Inspect Last n Rows with <code>df.tail()</code><a className="anchor-link" href="#2.-Inspect-Last-n-Rows-with-df.tail()">¶</a></h4><p>Similar to <code>df.head()</code>, you can use <code>df.tail(n)</code> to check the last <code>n</code> rows of a DataFrame.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [3]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span><span className="n">df</span><span className="o">.</span><span className="n">tail</span><span className="p">(</span><span className="mi">5</span><span className="p">)</span></span>

</code></pre></div>
</div>
</div>
</div>
</div>
<div className="jp-Cell-outputWrapper">
<div className="jp-Collapser jp-OutputCollapser jp-Cell-outputCollapser">
</div>
<div className="jp-OutputArea jp-Cell-outputArea">
<div className="jp-OutputArea-child">
<div></div><div className="jp-RenderedHTMLCommon jp-RenderedHTML jp-OutputArea-output jp-OutputArea-executeResult" data-mime-type="text/html">
<div>

<br /><table border="1" className="dataframe">
<thead>
<tr>
<th></th>
<th>Id</th>
<th>SepalLengthCm</th>
<th>SepalWidthCm</th>
<th>PetalLengthCm</th>
<th>PetalWidthCm</th>
<th>Species</th>
</tr>
</thead>
<tbody>
<tr>
<th>145</th>
<td>146</td>
<td>6.7</td>
<td>3.0</td>
<td>5.2</td>
<td>2.3</td>
<td>Iris-virginica</td>
</tr>
<tr>
<th>146</th>
<td>147</td>
<td>6.3</td>
<td>2.5</td>
<td>5.0</td>
<td>1.9</td>
<td>Iris-virginica</td>
</tr>
<tr>
<th>147</th>
<td>148</td>
<td>6.5</td>
<td>3.0</td>
<td>5.2</td>
<td>2.0</td>
<td>Iris-virginica</td>
</tr>
<tr>
<th>148</th>
<td>149</td>
<td>6.2</td>
<td>3.4</td>
<td>5.4</td>
<td>2.3</td>
<td>Iris-virginica</td>
</tr>
<tr>
<th>149</th>
<td>150</td>
<td>5.9</td>
<td>3.0</td>
<td>5.1</td>
<td>1.8</td>
<td>Iris-virginica</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h4 id="3.-Get-Dimensionality-of-a-DataFrame">3. Get Dimensionality of a DataFrame<a className="anchor-link" href="#3.-Get-Dimensionality-of-a-DataFrame">¶</a></h4><p>To check how many rows and columns a DataFrame has, use its <code>shape</code> attribute. It returns a tuple in the format <code>(rows, columns)</code>. As shown below, the Iris data has 150 rows and 6 columns.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [5]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span><span className="n">df</span><span className="o">.</span><span className="n">shape</span></span>

</code></pre></div>
</div>
</div>
</div>
</div>
<div className="jp-Cell-outputWrapper">
<div className="jp-Collapser jp-OutputCollapser jp-Cell-outputCollapser">
</div>
<div className="jp-OutputArea jp-Cell-outputArea">
<div className="jp-OutputArea-child">
<div></div><div className="jp-RenderedText jp-OutputArea-output jp-OutputArea-executeResult" data-mime-type="text/plain">
<pre className='demo-highlight python'><code className='sourceCode'><span>(150, 6)</span></code></pre></div>
</div>
</div>
</div>
</div>
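<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<p>Because <code>shape</code> is an ordinary tuple, one possible way to work with it is to unpack it into two variables. The names <code>n_rows</code> and <code>n_cols</code> are only illustrative. With the Iris file loaded above, the values would be 150 and 6.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [ ]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span># shape is a (rows, columns) tuple, so it can be unpacked directly</span>
<br /><span>n_rows, n_cols = df.shape</span>
<br /><span>print(n_rows, n_cols)</span>
</code></pre></div>
</div>
</div>
</div>
</div>
</div>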
<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h4 id="4.-DataFrame-Summary-with-info()">4. DataFrame Summary with <code>info()</code><a className="anchor-link" href="#4.-DataFrame-Summary-with-info()">¶</a></h4><p>The <code>info()</code> method prints a concise summary of a DataFrame: the number of rows, the count of non-null values in each column, the data type of each column, and the approximate memory usage. <br/><br/>
You can identify missing data by comparing each column's <code>Non-Null Count</code> with the total number of entries reported in the <code>RangeIndex</code> line at the top of the summary. A column whose non-null count is lower than that total contains missing values.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [8]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span><span className="n">df</span><span className="o">.</span><span className="n">info</span><span className="p">()</span></span>

</code></pre></div>
</div>
</div>
</div>
</div>
<div className="jp-Cell-outputWrapper">
<div className="jp-Collapser jp-OutputCollapser jp-Cell-outputCollapser">
</div>
<div className="jp-OutputArea jp-Cell-outputArea">
<div className="jp-OutputArea-child">
<div className="jp-OutputPrompt jp-OutputArea-prompt"></div>
<div className="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain">
<pre className='demo-highlight python'><code className='sourceCode'>&lt;class 'pandas.core.frame.DataFrame'&gt;
<br />RangeIndex: 150 entries, 0 to 149
<br />Data columns (total 6 columns):
<br /> #   Column         Non-Null Count  Dtype  
<br />---  ------         --------------  -----  
<br /> 0   Id             150 non-null    int64  
<br /> 1   SepalLengthCm  150 non-null    float64
<br /> 2   SepalWidthCm   150 non-null    float64
<br /> 3   PetalLengthCm  150 non-null    float64
<br /> 4   PetalWidthCm   150 non-null    float64
<br /> 5   Species        150 non-null    object 
<br />dtypes: float64(4), int64(1), object(1)
<br />memory usage: 7.2+ KB
</code></pre>
</div>
</div>
</div>
</div>
</div>
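<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<p>If you would rather see the number of missing values per column directly instead of reading it off the summary, one common approach is to chain <code>isnull()</code> and <code>sum()</code>. This is a minimal sketch that assumes the same <code>df</code> as above. Since every column in the summary reports 150 non-null values out of 150 entries, each count should be 0 for this file.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [ ]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span># isnull() marks each missing cell as True; sum() counts them per column</span>
<br /><span>df.isnull().sum()</span>
</code></pre></div>
</div>
</div>
</div>
</div>
</div>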
<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h4 id="5.-Numerical-Feature-Statistics-with-describe()">5. Numerical Feature Statistics with <code>describe()</code><a className="anchor-link" href="#5.-Numerical-Feature-Statistics-with-describe()">¶</a></h4><p>By default, the <code>describe()</code> method summarizes only the numerical columns of a DataFrame and skips non-numerical ones. For each column it reports the count, mean, standard deviation, minimum, maximum, and the 25th, 50th (median), and 75th percentiles. In our example, it computes these statistics for the first 5 columns in the DataFrame, excluding the <strong>Species</strong> column because it is not numerical.<br/> <br/>
This summary helps us quickly understand the spread of values and spot skew in a column, for example by comparing the mean with the median. Although statistics are computed for every numerical column, they are not always meaningful: the <strong>Id</strong> column is only a row identifier.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [7]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span><span className="n">df</span><span className="o">.</span><span className="n">describe</span><span className="p">()</span></span>

</code></pre></div>
</div>
</div>
</div>
</div>
<div className="jp-Cell-outputWrapper">
<div className="jp-Collapser jp-OutputCollapser jp-Cell-outputCollapser">
</div>
<div className="jp-OutputArea jp-Cell-outputArea">
<div className="jp-OutputArea-child">
<div></div><div className="jp-RenderedHTMLCommon jp-RenderedHTML jp-OutputArea-output jp-OutputArea-executeResult" data-mime-type="text/html">
<div>

<br /><table border="1" className="dataframe">
<thead>
<tr>
<th></th>
<th>Id</th>
<th>SepalLengthCm</th>
<th>SepalWidthCm</th>
<th>PetalLengthCm</th>
<th>PetalWidthCm</th>
</tr>
</thead>
<tbody>
<tr>
<th>count</th>
<td>150.000000</td>
<td>150.000000</td>
<td>150.000000</td>
<td>150.000000</td>
<td>150.000000</td>
</tr>
<tr>
<th>mean</th>
<td>75.500000</td>
<td>5.843333</td>
<td>3.054000</td>
<td>3.758667</td>
<td>1.198667</td>
</tr>
<tr>
<th>std</th>
<td>43.445368</td>
<td>0.828066</td>
<td>0.433594</td>
<td>1.764420</td>
<td>0.763161</td>
</tr>
<tr>
<th>min</th>
<td>1.000000</td>
<td>4.300000</td>
<td>2.000000</td>
<td>1.000000</td>
<td>0.100000</td>
</tr>
<tr>
<th>25%</th>
<td>38.250000</td>
<td>5.100000</td>
<td>2.800000</td>
<td>1.600000</td>
<td>0.300000</td>
</tr>
<tr>
<th>50%</th>
<td>75.500000</td>
<td>5.800000</td>
<td>3.000000</td>
<td>4.350000</td>
<td>1.300000</td>
</tr>
<tr>
<th>75%</th>
<td>112.750000</td>
<td>6.400000</td>
<td>3.300000</td>
<td>5.100000</td>
<td>1.800000</td>
</tr>
<tr>
<th>max</th>
<td>150.000000</td>
<td>7.900000</td>
<td>4.400000</td>
<td>6.900000</td>
<td>2.500000</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
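<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<p>Since the <strong>Id</strong> statistics are not informative, one option is to leave that column out before summarizing. The sketch below assumes the same <code>df</code> as above. <code>drop()</code> returns a new DataFrame, so <code>df</code> itself is left unchanged.</p>
</div>
</div>
</div>
</div><div className="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea">
<div className="jp-InputPrompt jp-InputArea-prompt">In [ ]:</div>
<div className="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div className="CodeMirror cm-s-jupyter">
<div className="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span># Summarize only the measurement columns by dropping Id first</span>
<br /><span>df.drop(columns=['Id']).describe()</span>
</code></pre></div>
</div>
</div>
</div>
</div>
</div>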
<div className="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div className="jp-Cell-inputWrapper">
<div className="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div className="jp-InputArea jp-Cell-inputArea"><div className="jp-InputPrompt jp-InputArea-prompt">
</div><div className="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<p>Great! Now that we've learned how to get a quick glimpse of the data, let's move on to the method for selecting a specific subset of the dataset.</p>
</div>
</div>
</div>
</div>
</div>
)}