import React from 'react'; 
import {Link} from 'react-router-dom'; 
import {useRCustomEffect} from '../../useCustomEffect'; 
import imgGgplot2CreatePoints from '../tutorial/scatterplot_geom_point.png';

export default function Ggplot2CreatePoints(){
useRCustomEffect()
return ( <div>
<div className="page-columns page-rows-contents page-layout-article" id="quarto-content">
<main className="content" id="quarto-document-content">
<header className="quarto-title-block default" id="title-block-header">
<div className="quarto-title">
<h1 className="title">Essentials of Creating a Scatterplot in ggplot2</h1>
</div>
<div className="quarto-title-meta">
</div>
</header>
<p>This tutorial explains <strong>how to create a scatterplot</strong>, covering two critical aspects:</p>
<ul>
<li><a href="#jitter">Add random noise to point positions to reveal overlapping data points</a></li>
<li><a href="#point_shape_color_fill">Dependence of <code>color</code> and <code>fill</code> on the <code>shape</code> aesthetic</a></li>
</ul>
<hr/>
<section className="level4" id="create-a-scatterplot">
<h4 className="anchored" data-anchor-id="create-a-scatterplot">Create a scatterplot</h4>
<p>Use <code>geom_point()</code> to create point elements. Each row in the input dataset corresponds to a point in the plot.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb1"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb1-1"><a aria-hidden="true" href="#cb1-1" tabindex="-1"></a><span className="co"># Packages and global theme</span></span>
<span id="cb1-2"><a aria-hidden="true" href="#cb1-2" tabindex="-1"></a><span className="fu">library</span>(ggplot2)</span>
<span id="cb1-3"><a aria-hidden="true" href="#cb1-3" tabindex="-1"></a><span className="fu">library</span>(dplyr)</span>
<span id="cb1-4"><a aria-hidden="true" href="#cb1-4" tabindex="-1"></a><span className="fu">theme_set</span>(<span className="fu">theme_minimal</span>(<span className="at">base_size =</span> <span className="dv">14</span>))</span>
<span id="cb1-5"><a aria-hidden="true" href="#cb1-5" tabindex="-1"></a></span><br/>
<span id="cb1-6"><a aria-hidden="true" href="#cb1-6" tabindex="-1"></a>p <span className="ot">&lt;-</span> chickwts <span className="sc">%&gt;%</span> </span>
<span id="cb1-7"><a aria-hidden="true" href="#cb1-7" tabindex="-1"></a>  <span className="fu">ggplot</span>(<span className="fu">aes</span>(<span className="at">x =</span> feed, <span className="at">y =</span> weight)) </span>
<span id="cb1-8"><a aria-hidden="true" href="#cb1-8" tabindex="-1"></a></span><br/>
<span id="cb1-9"><a aria-hidden="true" href="#cb1-9" tabindex="-1"></a>p <span className="sc">+</span> <span className="fu">geom_point</span>(<span className="at">size =</span> <span className="dv">3</span>, <span className="at">color =</span> <span className="st">"turquoise4"</span>)</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
<p><img className="cover-img" src={imgGgplot2CreatePoints} /></p>
</figure>
</div>
</div>
</div>
</section>
<section className="level4" id="jitter">
<h4 className="anchored" data-anchor-id="jitter">Jitter the point positions</h4>
<p><code>position = "jitter"</code> in <code>geom_point()</code> adds a small amount of random noise to each point’s position, which helps reveal overlapping data points. The code below uses <code>position_jitter()</code> instead, which allows fine-tuning the amount of noise in the horizontal (<code>width</code>) and vertical (<code>height</code>) directions. The <code>seed</code> argument accepts any number and makes the jitter reproducible: the same random offsets are generated each time the code is executed.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb2"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb2-1"><a aria-hidden="true" href="#cb2-1" tabindex="-1"></a>p <span className="sc">+</span> <span className="fu">geom_point</span>(</span>
<span id="cb2-2"><a aria-hidden="true" href="#cb2-2" tabindex="-1"></a>  <span className="at">position =</span> <span className="fu">position_jitter</span>(<span className="at">width =</span> .<span className="dv">1</span>, <span className="at">height =</span> <span className="dv">10</span>, <span className="at">seed =</span> <span className="dv">123</span>),</span>
<span id="cb2-3"><a aria-hidden="true" href="#cb2-3" tabindex="-1"></a>  <span className="at">size =</span> <span className="dv">3</span>)</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
<p><img className="img-fluid quarto-figure quarto-figure-center figure-img" src="tutorial/scatterplot_jitter.png"/></p>
</figure>
</div>
</div>
</div>
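<p>As a quick check of the reproducibility claim, the minimal sketch below builds the same jittered layer twice and compares the computed positions with <code>layer_data()</code>. The helper name <code>jittered_positions()</code> is hypothetical and used only for illustration; with a fixed <code>seed</code>, the two results are expected to be identical.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb2-seed"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb2-seed-1"><span className="fu">library</span>(ggplot2)</span>
<span id="cb2-seed-2"></span><br/>
<span id="cb2-seed-3">p <span className="ot">&lt;-</span> <span className="fu">ggplot</span>(chickwts, <span className="fu">aes</span>(<span className="at">x =</span> feed, <span className="at">y =</span> weight))</span>
<span id="cb2-seed-4"></span><br/>
<span id="cb2-seed-5"><span className="co"># hypothetical helper: jittered x and y positions computed for a given seed</span></span>
<span id="cb2-seed-6">jittered_positions <span className="ot">&lt;-</span> <span className="cf">function</span>(seed) <span className="fu">layer_data</span>(</span>
<span id="cb2-seed-7">  p <span className="sc">+</span> <span className="fu">geom_point</span>(<span className="at">position =</span> <span className="fu">position_jitter</span>(<span className="at">width =</span> .<span className="dv">1</span>, <span className="at">height =</span> <span className="dv">10</span>, <span className="at">seed =</span> seed))</span>
<span id="cb2-seed-8">)[, <span className="fu">c</span>(<span className="st">"x"</span>, <span className="st">"y"</span>)]</span>
<span id="cb2-seed-9"></span><br/>
<span id="cb2-seed-10"><span className="co"># compare the positions from two builds that share the same seed</span></span>
<span id="cb2-seed-11"><span className="fu">identical</span>(<span className="fu">jittered_positions</span>(<span className="dv">123</span>), <span className="fu">jittered_positions</span>(<span className="dv">123</span>))</span></code></pre></div>
</div>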
<p><code>geom_jitter()</code> is a shorthand for <code>geom_point(position = "jitter")</code>. However, it does <em>not</em> have a <code>seed</code> argument; to fix the seed, specify the position via <code>position = position_jitter()</code> instead.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb3"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb3-1"><a aria-hidden="true" href="#cb3-1" tabindex="-1"></a>p <span className="sc">+</span> <span className="fu">geom_jitter</span>(<span className="at">width =</span> .<span className="dv">1</span>, <span className="at">height =</span> <span className="dv">10</span>, <span className="at">size =</span> <span className="dv">3</span>)</span></code></pre></div>
</div>
<p>Alternatively, the <a href="https://github.com/eclarke/ggbeeswarm">ggbeeswarm</a> package arranges the offset points in a more organized and symmetrical pattern. It has two major functions, <code>geom_beeswarm()</code> and <code>geom_quasirandom()</code>.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb4"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb4-1"><a aria-hidden="true" href="#cb4-1" tabindex="-1"></a><span className="co"># install.packages("ggbeeswarm")</span></span>
<span id="cb4-2"><a aria-hidden="true" href="#cb4-2" tabindex="-1"></a><span className="fu">library</span>(ggbeeswarm)</span>
<span id="cb4-3"><a aria-hidden="true" href="#cb4-3" tabindex="-1"></a></span><br/>
<span id="cb4-4"><a aria-hidden="true" href="#cb4-4" tabindex="-1"></a><span className="co"># larger 'cex' value makes points more spread apart</span></span>
<span id="cb4-5"><a aria-hidden="true" href="#cb4-5" tabindex="-1"></a>p <span className="sc">+</span> <span className="fu">geom_beeswarm</span>(<span className="at">cex =</span> <span className="dv">3</span>, <span className="at">size =</span> <span className="dv">3</span>)</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
<p><img className="img-fluid quarto-figure quarto-figure-center figure-img" src="tutorial/scatterplot_beeswarm.png"/></p>
</figure>
</div>
</div>
</div>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb5"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb5-1"><a aria-hidden="true" href="#cb5-1" tabindex="-1"></a><span className="co"># larger 'width' value makes points more spread apart</span></span>
<span id="cb5-2"><a aria-hidden="true" href="#cb5-2" tabindex="-1"></a>p <span className="sc">+</span> <span className="fu">geom_quasirandom</span>(<span className="at">size =</span> <span className="dv">3</span>, <span className="at">width =</span> .<span className="dv">2</span>) </span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
<p><img className="img-fluid quarto-figure quarto-figure-center figure-img" src="tutorial/scatterplot_quasirandom.png"/></p>
</figure>
</div>
</div>
</div>
</section>
<section className="level4" id="point_shape_color_fill">
<h4 className="anchored">Dependence of aesthetics <code>color</code> and <code>fill</code> on <code>shape</code></h4>
<p>In ggplot2, each point shape is identified by a fixed integer from 0 to 25. The following script displays the number assigned to each shape.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb6"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb6-1"><a aria-hidden="true" href="#cb6-1" tabindex="-1"></a><span className="co"># create a data frame specifying the coordinate position of each point</span></span>
<span id="cb6-2"><a aria-hidden="true" href="#cb6-2" tabindex="-1"></a>d <span className="ot">&lt;-</span> <span className="fu">rbind</span>(<span className="fu">expand.grid</span>(<span className="dv">1</span><span className="sc">:</span><span className="dv">5</span>, <span className="dv">5</span><span className="sc">:</span><span className="dv">1</span>), </span>
<span id="cb6-3"><a aria-hidden="true" href="#cb6-3" tabindex="-1"></a>           <span className="fu">data.frame</span>(<span className="at">Var1 =</span> <span className="dv">6</span>, <span className="at">Var2 =</span> <span className="dv">1</span>)) </span>
<span id="cb6-4"><a aria-hidden="true" href="#cb6-4" tabindex="-1"></a></span><br/>
<span id="cb6-5"><a aria-hidden="true" href="#cb6-5" tabindex="-1"></a><span className="co"># demonstrate points each with a different shape</span></span>
<span id="cb6-6"><a aria-hidden="true" href="#cb6-6" tabindex="-1"></a><span className="fu">ggplot</span>(d, <span className="fu">aes</span>(Var1, Var2)) <span className="sc">+</span></span>
<span id="cb6-7"><a aria-hidden="true" href="#cb6-7" tabindex="-1"></a>  <span className="co"># points of different shapes</span></span>
<span id="cb6-8"><a aria-hidden="true" href="#cb6-8" tabindex="-1"></a>  <span className="fu">geom_point</span>(<span className="at">shape =</span> <span className="dv">0</span><span className="sc">:</span><span className="dv">25</span>, <span className="at">size =</span> <span className="dv">6</span>, </span>
<span id="cb6-9"><a aria-hidden="true" href="#cb6-9" tabindex="-1"></a>             <span className="at">stroke =</span> <span className="dv">2</span>, <span className="co"># thickness of the outline.  </span></span>
<span id="cb6-10"><a aria-hidden="true" href="#cb6-10" tabindex="-1"></a>             <span className="at">color =</span> <span className="st">"steelblue3"</span>, <span className="at">fill =</span> <span className="st">"gold"</span>) <span className="sc">+</span></span>
<span id="cb6-11"><a aria-hidden="true" href="#cb6-11" tabindex="-1"></a>  </span>
<span id="cb6-12"><a aria-hidden="true" href="#cb6-12" tabindex="-1"></a>  <span className="co"># mark the number associated with the shape</span></span>
<span id="cb6-13"><a aria-hidden="true" href="#cb6-13" tabindex="-1"></a>  <span className="fu">geom_text</span>(<span className="fu">aes</span>(<span className="at">label =</span> <span className="dv">0</span><span className="sc">:</span><span className="dv">25</span>), </span>
<span id="cb6-14"><a aria-hidden="true" href="#cb6-14" tabindex="-1"></a>            <span className="at">nudge_y =</span> .<span className="dv">4</span>,</span>
<span id="cb6-15"><a aria-hidden="true" href="#cb6-15" tabindex="-1"></a>            <span className="at">size =</span> <span className="dv">6</span>, <span className="at">fontface =</span> <span className="st">"bold"</span>) <span className="sc">+</span></span>
<span id="cb6-16"><a aria-hidden="true" href="#cb6-16" tabindex="-1"></a>  </span>
<span id="cb6-17"><a aria-hidden="true" href="#cb6-17" tabindex="-1"></a>  <span className="fu">theme_void</span>() <span className="co"># apply an empty background</span></span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
<p><img className="img-fluid quarto-figure quarto-figure-center figure-img" src="tutorial/scatterplot_number_shape.png"/></p>
</figure>
</div>
</div>
</div>
<ul>
<li><p>Shapes 0 to 14 are outlines, and shapes 15 to 20 are solid. All of shapes 0 to 20 are colored by the <code>color</code> aesthetic.</p></li>
<li><p>Shapes 21 to 25 each have an outline, colored by <code>color</code>, and an interior, colored by <code>fill</code>.</p></li>
</ul>
<p>To illustrate the dependence of <code>color</code> and <code>fill</code> on the <code>shape</code> aesthetic, compare the following two lines of script. If the <code>feed</code> variable is mapped to <code>fill</code> instead of <code>color</code>, the points are all black. This is because the default shapes used in these plots all fall in the range 0 to 20, which respond only to the <code>color</code> aesthetic; as such, these shapes do <em>not</em> understand the <code>fill</code> aesthetic.</p>
<div className="sourceCode cell-code" id="cb7"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb7-1"><a aria-hidden="true" href="#cb7-1" tabindex="-1"></a><span className="co"># colorful points</span></span>
<span id="cb7-2"><a aria-hidden="true" href="#cb7-2" tabindex="-1"></a>p <span className="sc">+</span> <span className="fu">geom_point</span>(<span className="fu">aes</span>(<span className="at">shape =</span> feed, <span className="at">color =</span> feed), <span className="at">size =</span> <span className="dv">3</span>)</span>
<span id="cb7-3"><a aria-hidden="true" href="#cb7-3" tabindex="-1"></a><span className="co"># black points</span></span>
<span id="cb7-4"><a aria-hidden="true" href="#cb7-4" tabindex="-1"></a>p <span className="sc">+</span> <span className="fu">geom_point</span>(<span className="fu">aes</span>(<span className="at">shape =</span> feed, <span className="at">fill =</span> feed),  <span className="at">size =</span> <span className="dv">3</span>)</span></code></pre></div>
<div className="cell quarto-layout-panel" data-layout-align="center" data-layout-ncol="2">
<div className="quarto-layout-row">
<div className="quarto-layout-cell">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
<p><img className="img-fluid quarto-figure quarto-figure-center figure-img" src="tutorial/scatterplot_color_fill.png"/></p>
</figure>
</div>
</div>
<div className="quarto-layout-cell">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
<p><img className="img-fluid quarto-figure quarto-figure-center figure-img" src="tutorial/scatterplot_color_fill-2.png"/></p>
</figure>
</div>
</div>
</div>
</div>
<p>Shape <code>21</code> has both an outline, specified by the <code>color</code> aesthetic, and a solid interior, controlled by the <code>fill</code> aesthetic. In the following plot, the interior <code>fill</code> is mapped to <code>feed</code> and color-coded, while the outline (not mapped to any variable) takes the default black color. The <code>stroke</code> argument specifies the thickness of the outline.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb8"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb8-1"><a aria-hidden="true" href="#cb8-1" tabindex="-1"></a>p <span className="sc">+</span> <span className="fu">geom_point</span>(<span className="fu">aes</span>(<span className="at">fill =</span> feed), <span className="at">shape =</span> <span className="dv">21</span>, <span className="at">size =</span> <span className="dv">3</span>, <span className="at">stroke =</span> <span className="dv">1</span>)</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
<p><img className="img-fluid quarto-figure quarto-figure-center figure-img" src="tutorial/scatterplot_shape21.png"/></p>
</figure>
</div>
</div>
</div>
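<p>Because the outline of shape <code>21</code> follows <code>color</code>, it can also be set to a constant other than black. The sketch below is one minimal way to do so; the object name <code>p21</code> and the color <code>"steelblue4"</code> are arbitrary choices made for illustration.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb8-outline"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb8-outline-1"><span className="fu">library</span>(ggplot2)</span>
<span id="cb8-outline-2"></span><br/>
<span id="cb8-outline-3">p <span className="ot">&lt;-</span> <span className="fu">ggplot</span>(chickwts, <span className="fu">aes</span>(<span className="at">x =</span> feed, <span className="at">y =</span> weight))</span>
<span id="cb8-outline-4"></span><br/>
<span id="cb8-outline-5"><span className="co"># interior fill is mapped to feed; the outline is set to one constant color</span></span>
<span id="cb8-outline-6">p21 <span className="ot">&lt;-</span> p <span className="sc">+</span> <span className="fu">geom_point</span>(<span className="fu">aes</span>(<span className="at">fill =</span> feed), <span className="at">shape =</span> <span className="dv">21</span>, <span className="at">size =</span> <span className="dv">3</span>, <span className="at">stroke =</span> <span className="dv">1</span>, <span className="at">color =</span> <span className="st">"steelblue4"</span>)</span>
<span id="cb8-outline-7">p21</span></code></pre></div>
</div>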
<hr/>
<h3 className="anchored" data-anchor-id="point_shape_color_fill">
<strong><em>Continue Exploring — 🚀 one level up!</em></strong>
</h3>
<p>Overcrowded points are a common problem in scatterplots of large datasets, and make it difficult to see the underlying data pattern. Check out <Link to="../1-ggplot2-overcrowded-data-points"><strong>this article</strong></Link> to learn powerful techniques for dealing with this issue.</p>
</section>
</main>
</div>
</div>
)}