import React from 'react'; 
import {Link} from 'react-router-dom'; 
import {useSparkCustomEffect} from '../../useCustomEffect'; 
export default function PythonOutput(){
useSparkCustomEffect()
return ( <div>
<div class="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div class="jp-Cell-inputWrapper">
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h1 id="Quick-Intro-to-Spark-Session">Quick Intro to Spark Session<a class="anchor-link" href="#Quick-Intro-to-Spark-Session">¶</a></h1><p>Spark Session is a key component of Apache Spark and serves as the main entry point for interacting with Spark's functionality. To use PySpark in a Python environment, the first step is to create a Spark Session.</p>
</div>
</div>
</div>
</div>
<div class="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div class="jp-Cell-inputWrapper">
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h3 id="Create-a-Spark-Session">Create a Spark Session<a class="anchor-link" href="#Create-a-Spark-Session">¶</a></h3><p><strong>Key Attributes and Methods</strong><br/></p>
<ul>
<li><code>builder</code>: An attribute of <code>SparkSession</code> that returns a builder used to configure and construct a Spark Session.</li>
<li><code>appName()</code>: Sets the name of the application, which is shown in the Spark web UI.</li>
<li><code>getOrCreate()</code>: Returns an existing Spark Session or creates a new one if none exists.</li>
</ul>
</div>
</div>
</div>
</div><div class="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs">
<div class="jp-Cell-inputWrapper">
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea">
<div class="jp-InputPrompt jp-InputArea-prompt">In [2]:</div>
<div class="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div class="CodeMirror cm-s-jupyter">
<div class="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span><span class="kn">from</span> <span class="nn">pyspark.sql</span> <span class="kn">import</span> <span class="n">SparkSession</span></span>


<br /><span><span class="c1"># Initialize Spark Session</span></span>
<span><span class="n">spark</span> <span class="o">=</span> <span class="n">SparkSession</span><span class="o">.</span><span class="n">builder</span><span class="o">.</span><span class="n">appName</span><span class="p">(</span><span class="s2">"MySparkApp"</span><span class="p">)</span><span class="o">.</span><span class="n">getOrCreate</span><span class="p">()</span></span>
</code></pre></div>
</div>
</div>
</div>
</div>
</div>
<div class="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div class="jp-Cell-inputWrapper">
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h3 id="Using-Spark-Session">Using Spark Session<a class="anchor-link" href="#Using-Spark-Session">¶</a></h3><p>Once created, the Spark Session (<code>spark</code>) can be used to create and manipulate DataFrames, execute SQL queries, and read data from different data sources.</p>
</div>
</div>
</div>
</div>
<div class="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div class="jp-Cell-inputWrapper">
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h3 id="Stop-a-Spark-Session">Stop a Spark Session<a class="anchor-link" href="#Stop-a-Spark-Session">¶</a></h3><p>Stop the Spark Session when your application finishes so that the resources it holds are released.
Call <code>spark.stop()</code> to terminate the session; this also stops the underlying <code>SparkContext</code>.</p>
</div>
</div>
</div>
</div><div class="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs">
<div class="jp-Cell-inputWrapper">
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea">
<div class="jp-InputPrompt jp-InputArea-prompt">In [3]:</div>
<div class="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div class="CodeMirror cm-s-jupyter">
<div class="highlight hl-ipython3"><pre className='demo-highlight python'><code className='sourceCode'><span><span class="n">spark</span><span class="o">.</span><span class="n">stop</span><span class="p">()</span></span>

</code></pre></div>
</div>
</div>
</div>
</div>
</div>
<div class="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div class="jp-Cell-inputWrapper">
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<p>Now that you know how to create a Spark Session, continue with the tutorial to learn about PySpark's core data structure: the <Link to="../spark-dataframe">PySpark DataFrame</Link>.</p>
</div>
</div>
</div>
</div>
<div class="jp-Cell jp-MarkdownCell jp-Notebook-cell">
<div class="jp-Cell-inputWrapper">
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h3 id="Other-Key-Features-of-Spark-Session"><em>Other Key Features of Spark Session</em><a class="anchor-link" href="#Other-Key-Features-of-Spark-Session">¶</a></h3><p><strong>It's a Unified Entry Point</strong></p>
<ul>
<li>Spark Session brings together Spark SQL and the DataFrame, Dataset, and streaming APIs behind a single object.</li>
<li>It simplifies working with data in different formats and from different sources.</li>
</ul>
<p><strong>It's a Replacement for SQLContext and HiveContext</strong></p>
<ul>
<li>Before Spark 2.0, <code>SQLContext</code> and <code>HiveContext</code> were separate entry points for SQL and Hive functionality. Spark Session, introduced in Spark 2.0, subsumes both and provides a single, more streamlined entry point.</li>
</ul>
</div>
</div>
</div>
</div>
</div>
)}