import React from 'react'; 
import {Link} from 'react-router-dom'; 
import {useRCustomEffect} from '../../useCustomEffect'; 
import AddTabsetQuarto from '../../js/addCodeFoldingTabforQuarto'; 
import imgGgplot2ScatterplotUrbanization from '../graphics/scatterplot_urbanization_ggrepel_completed.png'; 
import imgGgplot2ScatterplotUrbanizationWebp from '../graphics/scatterplot_urbanization_ggrepel_completed.webp'; 
export default function Ggplot2ScatterplotUrbanization(){
useRCustomEffect()
AddTabsetQuarto()
return ( <div>
<div className="page-columns page-rows-contents page-layout-article" id="quarto-content">
<main className="content" id="quarto-document-content">
<header className="quarto-title-block default" id="title-block-header">
<div className="quarto-title">
<h1 className="title">Create Scatterplot on Semi-log Scale in ggplot2 to Visualize GDP vs. Urbanization</h1>
</div>
<div className="quarto-title-meta">
</div>
</header>
  <picture>
    <source type="image/webp" srcset={imgGgplot2ScatterplotUrbanizationWebp} />
    <img className="cover-img" src={imgGgplot2ScatterplotUrbanization} />
  </picture>

<p>In this article, we’ll create a scatterplot to display the roughly linear relationship between the log10 of GDP per capita and the urbanization percentage across countries.</p>
<p><strong>Major techniques covered in this visualization include:</strong></p>
<ul>
<li><a href="#point_shape">Point shape customization.</a></li>
<li><a href="#aesthetic_inheritance_override">Override of aesthetic inheritance in the legend.</a></li>
<li><a href="#logarithmic_axis">Axis with logarithmic scale and annotation.</a></li>
<li><a href="#texts_minimal_overlap">Label texts with reduced overlap.</a></li>
</ul>
<hr/>
<div className="tabset-margin-container"></div><div className="panel-tabset">
<ul className="nav nav-tabs" role="tablist"><li className="nav-item" role="presentation"><a aria-controls="tabset-1-1" aria-selected="true" className="nav-link active" data-bs-target="#tabset-1-1" data-bs-toggle="tab" href="" id="tabset-1-1-tab" role="tab">Stepwise instructions</a></li><li className="nav-item" role="presentation"><a aria-controls="tabset-1-2" aria-selected="false" className="nav-link" data-bs-target="#tabset-1-2" data-bs-toggle="tab" href="" id="tabset-1-2-tab" role="tab">Code only</a></li></ul>
<div className="tab-content">
<div aria-labelledby="tabset-1-1-tab" className="tab-pane active" id="tabset-1-1" role="tabpanel">
<section className="level3" id="packages-and-data-cleanup">
<h3 className="anchored" data-anchor-id="packages-and-data-cleanup">Packages and data cleanup</h3>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb1"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb1-1"><a aria-hidden="true" href="#cb1-1" tabindex="-1"></a><span className="fu">library</span>(ggplot2)</span>
<span id="cb1-2"><a aria-hidden="true" href="#cb1-2" tabindex="-1"></a><span className="fu">library</span>(dplyr)</span>
<span id="cb1-3"><a aria-hidden="true" href="#cb1-3" tabindex="-1"></a><span className="fu">library</span>(ggrepel) <span className="co"># text with minimized overlap</span></span>
<span id="cb1-4"><a aria-hidden="true" href="#cb1-4" tabindex="-1"></a></span><br/>
<span id="cb1-5"><a aria-hidden="true" href="#cb1-5" tabindex="-1"></a><span className="co"># install.packages("carData")</span></span>
<span id="cb1-6"><a aria-hidden="true" href="#cb1-6" tabindex="-1"></a><span className="fu">library</span>(carData) <span className="co"># dataset package</span></span>
<span id="cb1-7"><a aria-hidden="true" href="#cb1-7" tabindex="-1"></a></span><br/>
<span id="cb1-8"><a aria-hidden="true" href="#cb1-8" tabindex="-1"></a>UN2 <span className="ot">&lt;-</span> UN <span className="sc">%&gt;%</span> <span className="fu">filter</span>(<span className="sc">!</span> <span className="fu">is.na</span>(region)) <span className="co"># %&gt;% as_tibble()</span></span>
<span id="cb1-9"><a aria-hidden="true" href="#cb1-9" tabindex="-1"></a></span><br/>
<span id="cb1-10"><a aria-hidden="true" href="#cb1-10" tabindex="-1"></a><span className="co"># display first 3 rows in tibble format</span></span>
<span id="cb1-11"><a aria-hidden="true" href="#cb1-11" tabindex="-1"></a><span className="fu">as_tibble</span>(UN2) <span className="sc">%&gt;%</span> <span className="fu">head</span>(<span className="at">n =</span> <span className="dv">3</span>)</span></code></pre></div>
<div className="cell-output cell-output-stdout">
<pre className="demo-highlight sourceCode r rcss"><code className="sourceCode r"># A tibble: 3 × 7
<br/>  region group  fertility ppgdp lifeExpF pctUrban infantMortality
<br/>  &lt;fct&gt;  &lt;fct&gt;      &lt;dbl&gt; &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;           &lt;dbl&gt;
<br/>1 Asia   other       5.97  499      49.5       23           125. 
<br/>2 Europe other       1.52 3677.     80.4       53            16.6
<br/>3 Africa africa      2.14 4473      75         67            21.5</code></pre>
</div>
</div>
</section>
<section className="level3" id="visualization">
<h3 className="anchored" data-anchor-id="visualization">Visualization</h3>
<p><span id="point_shape"><strong>Create a simple scatterplot.</strong> We use point shape <code>21</code>, which draws a circle with an outline and a filled interior (check out more <Link to="../../R/visualization/18-ggplot2-create-points">basics of points</Link>). The outline color is controlled by the <code>color</code> aesthetic, and the interior color by the <code>fill</code> aesthetic. Because the same palette (<code>"Set3"</code>) is applied to both <code>color</code> and <code>fill</code>, the outline and interior merge into a solid point (as desired). Mapping both aesthetics lets us style the outline and the interior separately in the legend keys at the next step.</span></p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb3"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb3-1"><a aria-hidden="true" href="#cb3-1" tabindex="-1"></a>p1 <span className="ot">&lt;-</span> UN2 <span className="sc">%&gt;%</span> </span>
<span id="cb3-2"><a aria-hidden="true" href="#cb3-2" tabindex="-1"></a>  <span className="fu">ggplot</span>(<span className="fu">aes</span>(<span className="at">x =</span> pctUrban, <span className="at">y =</span> ppgdp, <span className="at">color =</span> region, <span className="at">fill =</span> region)) <span className="sc">+</span> </span>
<span id="cb3-3"><a aria-hidden="true" href="#cb3-3" tabindex="-1"></a>  <span className="fu">geom_point</span>(<span className="at">shape =</span> <span className="dv">21</span>) <span className="sc">+</span></span>
<span id="cb3-4"><a aria-hidden="true" href="#cb3-4" tabindex="-1"></a>  <span className="co"># color scale</span></span>
<span id="cb3-5"><a aria-hidden="true" href="#cb3-5" tabindex="-1"></a>  <span className="fu">scale_color_brewer</span>(<span className="at">palette =</span> <span className="st">"Set3"</span>, <span className="at">name =</span> <span className="st">""</span>) <span className="sc">+</span> <span className="co"># remove legend title</span></span>
<span id="cb3-6"><a aria-hidden="true" href="#cb3-6" tabindex="-1"></a>  <span className="fu">scale_fill_brewer</span>(<span className="at">palette =</span> <span className="st">"Set3"</span>, <span className="at">name =</span> <span className="st">""</span>) </span>
<span id="cb3-7"><a aria-hidden="true" href="#cb3-7" tabindex="-1"></a></span><br/>
<span id="cb3-8"><a aria-hidden="true" href="#cb3-8" tabindex="-1"></a>p1</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics/scatterplot_urbanization_base.webp" />
    <img className="img-fluid quarto-figure quarto-figure-center figure-img" src="graphics/scatterplot_urbanization_base"/>
  </picture>
</figure>
</div>
</div>
</div>
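<p>To see the outline and the interior as two separate attributes, the sketch below keeps <code>fill</code> mapped to <code>region</code> but fixes the outline to black. It assumes the <code>UN2</code> data frame and the packages loaded above; <code>size = 3</code> is an arbitrary value chosen only to make the outline easier to see.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span><span className="co"># sketch: fixed black outline, interior colored by region</span></span>
<span>UN2 %&gt;% </span>
<span>  ggplot(aes(x = pctUrban, y = ppgdp, fill = region)) + </span>
<span>  geom_point(shape = 21, color = "black", size = 3) +</span>
<span>  scale_fill_brewer(palette = "Set3", name = "")</span></code></pre></div>
</div>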
<p><span id="aesthetic_inheritance_override"><strong>Enhance the legend.</strong> By default, the legend keys inherit the aesthetic properties of the associated <code>geom_*</code>. In this example, the keys inherit the outline color, fill, shape, and size of <code>geom_point()</code>, and are drawn as small solid points. To make the legend more prominent, we <strong><em>override</em> the aesthetic inheritance in the legend</strong>: we draw a black outline around the legend keys and increase their size.</span></p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb4"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb4-1"><a aria-hidden="true" href="#cb4-1" tabindex="-1"></a>p2 <span className="ot">&lt;-</span> p1 <span className="sc">+</span> <span className="fu">guides</span>(<span className="at">color =</span> <span className="fu">guide_legend</span>(</span>
<span id="cb4-2"><a aria-hidden="true" href="#cb4-2" tabindex="-1"></a>  <span className="at">override.aes =</span> <span className="fu">list</span>(<span className="at">color =</span> <span className="st">"black"</span>, <span className="at">size =</span> <span className="dv">5</span>))) </span>
<span id="cb4-3"><a aria-hidden="true" href="#cb4-3" tabindex="-1"></a>p2</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics/scatterplot_urbanization_legend_override.webp" />
    <img className="img-fluid quarto-figure quarto-figure-center figure-img" src="graphics/scatterplot_urbanization_legend_override"/>
  </picture>
</figure>
</div>
</div>
</div>
<p><span id="logarithmic_axis"><strong>Transform the y-axis to a base-10 logarithmic scale, and add log-based ticks.</strong> This unveils a roughly linear relationship between log(y) and x. On the transformed y-axis, equal distances correspond to equal ratios rather than equal differences (how to read a <Link to="https://en.wikipedia.org/wiki/Logarithmic_scale">log scale</Link>). (See also this <Link to="../ggplot2-scatterplot-diamonds">diamonds scatterplot</Link>, with both axes log-transformed at base 2.)</span></p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb5"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb5-1"><a aria-hidden="true" href="#cb5-1" tabindex="-1"></a>p3 <span className="ot">&lt;-</span> p2 <span className="sc">+</span> <span className="fu">scale_y_log10</span>(</span>
<span id="cb5-2"><a aria-hidden="true" href="#cb5-2" tabindex="-1"></a>  <span className="co"># set breaks (of the main grids)</span></span>
<span id="cb5-3"><a aria-hidden="true" href="#cb5-3" tabindex="-1"></a>  <span className="at">breaks =</span> <span className="fu">c</span>(<span className="dv">100</span>, <span className="dv">500</span>, <span className="dv">1000</span>, <span className="dv">5000</span>, <span className="dv">10</span><span className="sc">^</span><span className="dv">4</span>, <span className="dv">5</span><span className="sc">*</span><span className="dv">10</span><span className="sc">^</span><span className="dv">4</span>, <span className="dv">10</span><span className="sc">^</span><span className="dv">5</span>),</span>
<span id="cb5-4"><a aria-hidden="true" href="#cb5-4" tabindex="-1"></a>  <span className="at">labels =</span> <span className="cf">function</span>(x) &#123;<span className="fu">paste</span>(x<span className="sc">/</span><span className="dv">1000</span>, <span className="st">"K"</span>)&#125; ) <span className="sc">+</span></span>
<span id="cb5-5"><a aria-hidden="true" href="#cb5-5" tabindex="-1"></a>  </span>
<span id="cb5-6"><a aria-hidden="true" href="#cb5-6" tabindex="-1"></a>  <span className="co"># add log-10 scale ticks; </span></span>
<span id="cb5-7"><a aria-hidden="true" href="#cb5-7" tabindex="-1"></a>  <span className="co"># note that the ticks are not evenly spaced</span></span>
<span id="cb5-8"><a aria-hidden="true" href="#cb5-8" tabindex="-1"></a>  <span className="fu">annotation_logticks</span>(<span className="at">sides =</span> <span className="st">"l"</span>, <span className="at">colour =</span> <span className="st">"white"</span>) </span>
<span id="cb5-9"><a aria-hidden="true" href="#cb5-9" tabindex="-1"></a></span><br/>
<span id="cb5-10"><a aria-hidden="true" href="#cb5-10" tabindex="-1"></a>p3</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics/scatterplot_urbanization_log_scale.webp" />
    <img className="img-fluid quarto-figure quarto-figure-center figure-img" src="graphics/scatterplot_urbanization_log_scale"/>
  </picture>
</figure>
</div>
</div>
</div>
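<p>As a quick check of how the log scale spaces values, the sketch below uses base R only. Breaks that differ by a factor of 10 sit one unit apart after the <code>log10()</code> transformation, and the label function used above divides each break by 1000 and appends "K". The <code>breaks</code> vector here is a shortened, illustrative subset of the breaks in the plot.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span><span className="co"># illustrative subset of the axis breaks</span></span>
<span>breaks &lt;- c(100, 1000, 10^4, 10^5)</span>
<span></span><br/>
<span>log10(breaks)        <span className="co"># 2 3 4 5: positions on the transformed axis</span></span>
<span>diff(log10(breaks))  <span className="co"># 1 1 1: each 10-fold increase covers the same distance</span></span>
<span></span><br/>
<span><span className="co"># labels generated by the function passed to scale_y_log10()</span></span>
<span>paste(c(100, 500, 1000, 5000) / 1000, "K")  <span className="co"># "0.1 K" "0.5 K" "1 K" "5 K"</span></span></code></pre></div>
</div>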
<p>By default, a minor grid line is drawn midway between two major grid lines. This is informative on a linear scale, but not on a logarithmic scale. We’ll remove the minor grid lines in the next step (<code>panel.grid.minor</code> removes them on both axes).</p>
<p><strong>Customize the theme, and remove the minor grid lines.</strong></p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb6"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb6-1"><a aria-hidden="true" href="#cb6-1" tabindex="-1"></a>p4 <span className="ot">&lt;-</span> p3 <span className="sc">+</span>  </span>
<span id="cb6-2"><a aria-hidden="true" href="#cb6-2" tabindex="-1"></a>  <span className="co"># add titles</span></span>
<span id="cb6-3"><a aria-hidden="true" href="#cb6-3" tabindex="-1"></a>  <span className="fu">labs</span>(</span>
<span id="cb6-4"><a aria-hidden="true" href="#cb6-4" tabindex="-1"></a>    <span className="at">y =</span> <span className="st">"GDP per capita (US $)"</span>, </span>
<span id="cb6-5"><a aria-hidden="true" href="#cb6-5" tabindex="-1"></a>    <span className="at">x =</span> <span className="st">"Urbanization percent"</span>,</span>
<span id="cb6-6"><a aria-hidden="true" href="#cb6-6" tabindex="-1"></a>    <span className="at">title =</span> <span className="st">"  UN National Statistics, 2009–2011"</span>) <span className="sc">+</span></span>
<span id="cb6-7"><a aria-hidden="true" href="#cb6-7" tabindex="-1"></a>  </span>
<span id="cb6-8"><a aria-hidden="true" href="#cb6-8" tabindex="-1"></a>  <span className="co"># theme</span></span>
<span id="cb6-9"><a aria-hidden="true" href="#cb6-9" tabindex="-1"></a>  <span className="fu">theme_minimal</span>(<span className="at">base_size =</span> <span className="dv">15</span>) <span className="sc">+</span></span>
<span id="cb6-10"><a aria-hidden="true" href="#cb6-10" tabindex="-1"></a>  <span className="fu">theme</span>(</span>
<span id="cb6-11"><a aria-hidden="true" href="#cb6-11" tabindex="-1"></a>    <span className="co"># remove minor grids, which are not meaningful in log scale</span></span>
<span id="cb6-12"><a aria-hidden="true" href="#cb6-12" tabindex="-1"></a>    <span className="at">panel.grid.minor =</span> <span className="fu">element_blank</span>(),</span>
<span id="cb6-13"><a aria-hidden="true" href="#cb6-13" tabindex="-1"></a>    </span>
<span id="cb6-14"><a aria-hidden="true" href="#cb6-14" tabindex="-1"></a>    <span className="at">panel.background =</span> <span className="fu">element_rect</span>(<span className="at">fill =</span> <span className="st">"black"</span>),</span>
<span id="cb6-15"><a aria-hidden="true" href="#cb6-15" tabindex="-1"></a>    <span className="at">panel.grid =</span> <span className="fu">element_line</span>(<span className="at">color =</span> <span className="st">"grey30"</span>),</span>
<span id="cb6-16"><a aria-hidden="true" href="#cb6-16" tabindex="-1"></a>    <span className="co"># use 'vjust' to sink plot title downward </span></span>
<span id="cb6-17"><a aria-hidden="true" href="#cb6-17" tabindex="-1"></a>    <span className="at">plot.title =</span> <span className="fu">element_text</span>(<span className="at">vjust =</span> <span className="sc">-</span><span className="dv">6</span>, <span className="at">color =</span> <span className="st">"snow3"</span>)) </span>
<span id="cb6-18"><a aria-hidden="true" href="#cb6-18" tabindex="-1"></a></span><br/>
<span id="cb6-19"><a aria-hidden="true" href="#cb6-19" tabindex="-1"></a>p4</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics/scatterplot_urbanization_theme.webp" />
    <img className="img-fluid quarto-figure quarto-figure-center figure-img" src="graphics/scatterplot_urbanization_theme"/>
  </picture>
</figure>
</div>
</div>
</div>
<p><strong>Draw a regression line, calculated based on transformed data</strong>, i.e., log(y) and x. Because <code>color = region</code> and <code>fill = region</code> are inherited from the <code>ggplot()</code> line, <code>geom_smooth()</code> would by default fit a separate regression line for each region. Here, however, we override this inheritance by assigning <em>fixed values</em> to <code>color</code> and <code>fill</code>. This results in a <em>single</em> regression line fitted to the entire dataset, roughly equivalent to specifying <code>aes(group = 1)</code> in <code>geom_smooth()</code>.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb7"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb7-1"><a aria-hidden="true" href="#cb7-1" tabindex="-1"></a>p5 <span className="ot">&lt;-</span> p4 <span className="sc">+</span>  </span>
<span id="cb7-2"><a aria-hidden="true" href="#cb7-2" tabindex="-1"></a>  <span className="fu">geom_smooth</span>(<span className="at">method =</span> <span className="st">"lm"</span>, <span className="at">color =</span> <span className="st">"white"</span>, </span>
<span id="cb7-3"><a aria-hidden="true" href="#cb7-3" tabindex="-1"></a>              <span className="at">fill =</span> <span className="st">"beige"</span>, <span className="at">alpha =</span> .<span className="dv">4</span>)</span>
<span id="cb7-4"><a aria-hidden="true" href="#cb7-4" tabindex="-1"></a>p5</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics/scatterplot_urbanization_regression.webp" />
    <img className="img-fluid quarto-figure quarto-figure-center figure-img" src="graphics/scatterplot_urbanization_regression"/>
  </picture>
</figure>
</div>
</div>
</div>
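<p>Because ggplot2 applies scale transformations before computing statistics, the line above should correspond to a linear model of log10(y) on x. A minimal sketch of that model outside the plot, assuming the <code>UN2</code> data frame from above (the object name <code>fit</code> is our own choice):</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span><span className="co"># fit log10(GDP per capita) against urbanization percent</span></span>
<span>fit &lt;- lm(log10(ppgdp) ~ pctUrban, data = UN2)</span>
<span>coef(fit)</span>
<span></span><br/>
<span><span className="co"># multiplicative change in predicted GDP per one-point increase in pctUrban</span></span>
<span>10^coef(fit)["pctUrban"]</span></code></pre></div>
</div>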
<p><span id="texts_minimal_overlap"><strong>Label the countries by name using the <Link to="https://ggrepel.slowkow.com/"><code>ggrepel</code></Link> package.</strong> The main function <code>geom_text_repel()</code> works similarly to <code>geom_point()</code>, but adds text labels instead of points. It automatically repels text labels from each other and from the points to reduce overlap, and keeps all labels within the plot boundary.</span></p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb8"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb8-1"><a aria-hidden="true" href="#cb8-1" tabindex="-1"></a><span className="co"># label with country name, with minimized overlap</span></span>
<span id="cb8-2"><a aria-hidden="true" href="#cb8-2" tabindex="-1"></a>p6 <span className="ot">&lt;-</span> p5 <span className="sc">+</span> </span>
<span id="cb8-3"><a aria-hidden="true" href="#cb8-3" tabindex="-1"></a>  <span className="fu">geom_text_repel</span>(<span className="fu">aes</span>(<span className="at">label =</span> <span className="fu">rownames</span>(UN2)),</span>
<span id="cb8-4"><a aria-hidden="true" href="#cb8-4" tabindex="-1"></a>                  <span className="at">box.padding =</span> <span className="fu">unit</span>(<span className="dv">0</span>, <span className="st">"pt"</span>),</span>
<span id="cb8-5"><a aria-hidden="true" href="#cb8-5" tabindex="-1"></a>                  <span className="at">max.overlaps =</span> <span className="cn">Inf</span>,</span>
<span id="cb8-6"><a aria-hidden="true" href="#cb8-6" tabindex="-1"></a>                  <span className="at">size =</span> <span className="fl">2.3</span>, <span className="at">show.legend =</span> <span className="cn">FALSE</span>) </span>
<span id="cb8-7"><a aria-hidden="true" href="#cb8-7" tabindex="-1"></a>p6</span></code></pre></div>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
<figure className="figure">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics/scatterplot_urbanization_ggrepel_completed.webp" />
    <img className="img-fluid quarto-figure quarto-figure-center figure-img" src="graphics/scatterplot_urbanization_ggrepel_completed"/>
  </picture>
</figure>
</div>
</div>
</div>
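<p>If labeling every country is too crowded, one option is to label only a subset by passing a filtered data frame to <code>geom_text_repel()</code>. The sketch below is an illustration, not part of the original plot: the <code>UN_labeled</code> object, the <code>country</code> column, and the 40000 threshold are hypothetical names and values chosen for this example.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span><span className="co"># store the row names (country names) in a regular column</span></span>
<span>UN_labeled &lt;- UN2</span>
<span>UN_labeled$country &lt;- rownames(UN2)</span>
<span></span><br/>
<span><span className="co"># label only countries above an arbitrary GDP threshold</span></span>
<span>p5 + </span>
<span>  geom_text_repel(data = subset(UN_labeled, ppgdp &gt; 40000),</span>
<span>                  aes(label = country),</span>
<span>                  size = 2.3, show.legend = FALSE)</span></code></pre></div>
</div>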
</section>
</div>
<div aria-labelledby="tabset-1-2-tab" className="tab-pane" id="tabset-1-2" role="tabpanel">
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb9"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb9-1"><a aria-hidden="true" href="#cb9-1" tabindex="-1"></a><span className="fu">library</span>(ggplot2)</span>
<span id="cb9-2"><a aria-hidden="true" href="#cb9-2" tabindex="-1"></a><span className="fu">library</span>(dplyr)</span>
<span id="cb9-3"><a aria-hidden="true" href="#cb9-3" tabindex="-1"></a><span className="fu">library</span>(ggrepel) <span className="co"># text with minimized overlap</span></span>
<span id="cb9-4"><a aria-hidden="true" href="#cb9-4" tabindex="-1"></a></span><br/>
<span id="cb9-5"><a aria-hidden="true" href="#cb9-5" tabindex="-1"></a><span className="co"># install.packages("carData")</span></span>
<span id="cb9-6"><a aria-hidden="true" href="#cb9-6" tabindex="-1"></a><span className="fu">library</span>(carData) <span className="co"># dataset package</span></span>
<span id="cb9-7"><a aria-hidden="true" href="#cb9-7" tabindex="-1"></a></span><br/>
<span id="cb9-8"><a aria-hidden="true" href="#cb9-8" tabindex="-1"></a>UN2 <span className="ot">&lt;-</span> UN <span className="sc">%&gt;%</span> <span className="fu">filter</span>(<span className="sc">!</span> <span className="fu">is.na</span>(region)) <span className="co"># %&gt;% as_tibble()</span></span>
<span id="cb9-9"><a aria-hidden="true" href="#cb9-9" tabindex="-1"></a></span><br/>
<span id="cb9-10"><a aria-hidden="true" href="#cb9-10" tabindex="-1"></a><span className="co"># display first 3 rows in tibble format</span></span>
<span id="cb9-11"><a aria-hidden="true" href="#cb9-11" tabindex="-1"></a><span className="fu">as_tibble</span>(UN2) <span className="sc">%&gt;%</span> <span className="fu">head</span>(<span className="at">n =</span> <span className="dv">3</span>)</span>
<span id="cb9-12"><a aria-hidden="true" href="#cb9-12" tabindex="-1"></a></span><br/>
<span id="cb9-13"><a aria-hidden="true" href="#cb9-13" tabindex="-1"></a></span><br/>
<span id="cb9-14"><a aria-hidden="true" href="#cb9-14" tabindex="-1"></a><span className="co"># Create a scatter plot</span></span>
<span id="cb9-15"><a aria-hidden="true" href="#cb9-15" tabindex="-1"></a>p1 <span className="ot">&lt;-</span> UN2 <span className="sc">%&gt;%</span> </span>
<span id="cb9-16"><a aria-hidden="true" href="#cb9-16" tabindex="-1"></a>  <span className="fu">ggplot</span>(<span className="fu">aes</span>(<span className="at">x =</span> pctUrban, <span className="at">y =</span> ppgdp, <span className="at">color =</span> region, <span className="at">fill =</span> region)) <span className="sc">+</span> </span>
<span id="cb9-17"><a aria-hidden="true" href="#cb9-17" tabindex="-1"></a>  <span className="fu">geom_point</span>(<span className="at">shape =</span> <span className="dv">21</span>) <span className="sc">+</span></span>
<span id="cb9-18"><a aria-hidden="true" href="#cb9-18" tabindex="-1"></a>  <span className="co"># color scale</span></span>
<span id="cb9-19"><a aria-hidden="true" href="#cb9-19" tabindex="-1"></a>  <span className="fu">scale_color_brewer</span>(<span className="at">palette =</span> <span className="st">"Set3"</span>, </span>
<span id="cb9-20"><a aria-hidden="true" href="#cb9-20" tabindex="-1"></a>                     <span className="at">name =</span> <span className="st">""</span>) <span className="sc">+</span>   <span className="co"># remove legend title</span></span>
<span id="cb9-21"><a aria-hidden="true" href="#cb9-21" tabindex="-1"></a>  <span className="fu">scale_fill_brewer</span>(<span className="at">palette =</span> <span className="st">"Set3"</span>, <span className="at">name =</span> <span className="st">""</span>) </span>
<span id="cb9-22"><a aria-hidden="true" href="#cb9-22" tabindex="-1"></a></span><br/>
<span id="cb9-23"><a aria-hidden="true" href="#cb9-23" tabindex="-1"></a>p1</span>
<span id="cb9-24"><a aria-hidden="true" href="#cb9-24" tabindex="-1"></a></span><br/>
<span id="cb9-25"><a aria-hidden="true" href="#cb9-25" tabindex="-1"></a></span><br/>
<span id="cb9-26"><a aria-hidden="true" href="#cb9-26" tabindex="-1"></a><span className="co"># Update the legend keys: </span></span>
<span id="cb9-27"><a aria-hidden="true" href="#cb9-27" tabindex="-1"></a><span className="co"># Sketch out a black outline, and increase their size</span></span>
<span id="cb9-28"><a aria-hidden="true" href="#cb9-28" tabindex="-1"></a>p2 <span className="ot">&lt;-</span> p1 <span className="sc">+</span> <span className="fu">guides</span>(<span className="at">color =</span> <span className="fu">guide_legend</span>(</span>
<span id="cb9-29"><a aria-hidden="true" href="#cb9-29" tabindex="-1"></a>  <span className="at">override.aes =</span> <span className="fu">list</span>(<span className="at">color =</span> <span className="st">"black"</span>, <span className="at">size =</span> <span className="dv">5</span>))) </span>
<span id="cb9-30"><a aria-hidden="true" href="#cb9-30" tabindex="-1"></a>p2</span>
<span id="cb9-31"><a aria-hidden="true" href="#cb9-31" tabindex="-1"></a></span><br/>
<span id="cb9-32"><a aria-hidden="true" href="#cb9-32" tabindex="-1"></a></span><br/>
<span id="cb9-33"><a aria-hidden="true" href="#cb9-33" tabindex="-1"></a><span className="co"># Transform y-axis to logarithmic scale, and add log-based ticks. </span></span>
<span id="cb9-34"><a aria-hidden="true" href="#cb9-34" tabindex="-1"></a>p3 <span className="ot">&lt;-</span> p2 <span className="sc">+</span> <span className="fu">scale_y_log10</span>(</span>
<span id="cb9-35"><a aria-hidden="true" href="#cb9-35" tabindex="-1"></a>  <span className="co"># set breaks (of the main grids)</span></span>
<span id="cb9-36"><a aria-hidden="true" href="#cb9-36" tabindex="-1"></a>  <span className="at">breaks =</span> <span className="fu">c</span>(<span className="dv">100</span>, <span className="dv">500</span>, <span className="dv">1000</span>, <span className="dv">5000</span>, <span className="dv">10</span><span className="sc">^</span><span className="dv">4</span>, <span className="dv">5</span><span className="sc">*</span><span className="dv">10</span><span className="sc">^</span><span className="dv">4</span>, <span className="dv">10</span><span className="sc">^</span><span className="dv">5</span>),</span>
<span id="cb9-37"><a aria-hidden="true" href="#cb9-37" tabindex="-1"></a>  <span className="at">labels =</span> <span className="cf">function</span>(x) &#123;<span className="fu">paste</span>(x<span className="sc">/</span><span className="dv">1000</span>, <span className="st">"K"</span>)&#125; ) <span className="sc">+</span></span>
<span id="cb9-38"><a aria-hidden="true" href="#cb9-38" tabindex="-1"></a>  </span>
<span id="cb9-39"><a aria-hidden="true" href="#cb9-39" tabindex="-1"></a>  <span className="co"># add log-10 scale ticks; </span></span>
<span id="cb9-40"><a aria-hidden="true" href="#cb9-40" tabindex="-1"></a>  <span className="co"># note that the ticks are not evenly spaced</span></span>
<span id="cb9-41"><a aria-hidden="true" href="#cb9-41" tabindex="-1"></a>  <span className="fu">annotation_logticks</span>(<span className="at">sides =</span> <span className="st">"l"</span>, <span className="at">colour =</span> <span className="st">"white"</span>) </span>
<span id="cb9-42"><a aria-hidden="true" href="#cb9-42" tabindex="-1"></a>p3</span>
<span id="cb9-43"><a aria-hidden="true" href="#cb9-43" tabindex="-1"></a></span><br/>
<span id="cb9-44"><a aria-hidden="true" href="#cb9-44" tabindex="-1"></a></span><br/>
<span id="cb9-45"><a aria-hidden="true" href="#cb9-45" tabindex="-1"></a><span className="co"># Add plot titles, and customize the theme. </span></span>
<span id="cb9-46"><a aria-hidden="true" href="#cb9-46" tabindex="-1"></a><span className="co"># Also remove the minor grids, which are not meaningful in log scale.</span></span>
<span id="cb9-47"><a aria-hidden="true" href="#cb9-47" tabindex="-1"></a>p4 <span className="ot">&lt;-</span> p3 <span className="sc">+</span>  </span>
<span id="cb9-48"><a aria-hidden="true" href="#cb9-48" tabindex="-1"></a>  <span className="co"># add titles</span></span>
<span id="cb9-49"><a aria-hidden="true" href="#cb9-49" tabindex="-1"></a>  <span className="fu">labs</span>(</span>
<span id="cb9-50"><a aria-hidden="true" href="#cb9-50" tabindex="-1"></a>    <span className="at">y =</span> <span className="st">"GDP per capita (US $)"</span>, </span>
<span id="cb9-51"><a aria-hidden="true" href="#cb9-51" tabindex="-1"></a>    <span className="at">x =</span> <span className="st">"Urbanization percent"</span>,</span>
<span id="cb9-52"><a aria-hidden="true" href="#cb9-52" tabindex="-1"></a>    <span className="at">title =</span> <span className="st">"  UN National Statistics, 2009–2011"</span>) <span className="sc">+</span></span>
<span id="cb9-53"><a aria-hidden="true" href="#cb9-53" tabindex="-1"></a>  </span>
<span id="cb9-54"><a aria-hidden="true" href="#cb9-54" tabindex="-1"></a>  <span className="co"># theme</span></span>
<span id="cb9-55"><a aria-hidden="true" href="#cb9-55" tabindex="-1"></a>  <span className="fu">theme_minimal</span>(<span className="at">base_size =</span> <span className="dv">15</span>) <span className="sc">+</span></span>
<span id="cb9-56"><a aria-hidden="true" href="#cb9-56" tabindex="-1"></a>  <span className="fu">theme</span>(</span>
<span id="cb9-57"><a aria-hidden="true" href="#cb9-57" tabindex="-1"></a>    <span className="co"># remove minor grids, which are not meaningful in log scale</span></span>
<span id="cb9-58"><a aria-hidden="true" href="#cb9-58" tabindex="-1"></a>    <span className="at">panel.grid.minor =</span> <span className="fu">element_blank</span>(),</span>
<span id="cb9-59"><a aria-hidden="true" href="#cb9-59" tabindex="-1"></a>    </span>
<span id="cb9-60"><a aria-hidden="true" href="#cb9-60" tabindex="-1"></a>    <span className="at">panel.background =</span> <span className="fu">element_rect</span>(<span className="at">fill =</span> <span className="st">"black"</span>),</span>
<span id="cb9-61"><a aria-hidden="true" href="#cb9-61" tabindex="-1"></a>    <span className="at">panel.grid =</span> <span className="fu">element_line</span>(<span className="at">color =</span> <span className="st">"grey30"</span>),</span>
<span id="cb9-62"><a aria-hidden="true" href="#cb9-62" tabindex="-1"></a>    <span className="co"># use 'vjust' to sink plot title downward </span></span>
<span id="cb9-63"><a aria-hidden="true" href="#cb9-63" tabindex="-1"></a>    <span className="at">plot.title =</span> <span className="fu">element_text</span>(<span className="at">vjust =</span> <span className="sc">-</span><span className="dv">6</span>, <span className="at">color =</span> <span className="st">"snow3"</span>)) </span>
<span id="cb9-64"><a aria-hidden="true" href="#cb9-64" tabindex="-1"></a></span><br/>
<span id="cb9-65"><a aria-hidden="true" href="#cb9-65" tabindex="-1"></a>p4</span>
<span id="cb9-66"><a aria-hidden="true" href="#cb9-66" tabindex="-1"></a></span><br/>
<span id="cb9-67"><a aria-hidden="true" href="#cb9-67" tabindex="-1"></a></span><br/>
<span id="cb9-68"><a aria-hidden="true" href="#cb9-68" tabindex="-1"></a><span className="co"># Add a regression line. With the log10 y scale, the model is fitted to log10(y) versus x.</span></span>
<span id="cb9-69"><a aria-hidden="true" href="#cb9-69" tabindex="-1"></a>p5 <span className="ot">&lt;-</span> p4 <span className="sc">+</span>  </span>
<span id="cb9-70"><a aria-hidden="true" href="#cb9-70" tabindex="-1"></a>  <span className="fu">geom_smooth</span>(<span className="at">method =</span> <span className="st">"lm"</span>, <span className="at">color =</span> <span className="st">"white"</span>, </span>
<span id="cb9-71"><a aria-hidden="true" href="#cb9-71" tabindex="-1"></a>              <span className="at">fill =</span> <span className="st">"beige"</span>, <span className="at">alpha =</span> .<span className="dv">4</span>)</span>
<span id="cb9-72"><a aria-hidden="true" href="#cb9-72" tabindex="-1"></a>p5</span>
<span id="cb9-73"><a aria-hidden="true" href="#cb9-73" tabindex="-1"></a></span><br/>
<span id="cb9-74"><a aria-hidden="true" href="#cb9-74" tabindex="-1"></a></span><br/>
<span id="cb9-75"><a aria-hidden="true" href="#cb9-75" tabindex="-1"></a><span className="co"># Label the countries by name using the 'ggrepel' package, which reduces text overlap.</span></span>
<span id="cb9-76"><a aria-hidden="true" href="#cb9-76" tabindex="-1"></a>p6 <span className="ot">&lt;-</span> p5 <span className="sc">+</span> </span>
<span id="cb9-77"><a aria-hidden="true" href="#cb9-77" tabindex="-1"></a>  <span className="fu">geom_text_repel</span>(<span className="fu">aes</span>(<span className="at">label =</span> <span className="fu">rownames</span>(UN2)),</span>
<span id="cb9-78"><a aria-hidden="true" href="#cb9-78" tabindex="-1"></a>                  <span className="at">box.padding =</span> <span className="fu">unit</span>(<span className="dv">0</span>, <span className="st">"pt"</span>),</span>
<span id="cb9-79"><a aria-hidden="true" href="#cb9-79" tabindex="-1"></a>                  <span className="at">max.overlaps =</span> <span className="cn">Inf</span>,</span>
<span id="cb9-80"><a aria-hidden="true" href="#cb9-80" tabindex="-1"></a>                  <span className="at">size =</span> <span className="fl">2.3</span>, <span className="at">show.legend =</span> <span className="cn">FALSE</span>) </span>
<span id="cb9-81"><a aria-hidden="true" href="#cb9-81" tabindex="-1"></a>p6</span></code></pre></div>
</div>
</div>
</div>
</div>
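<p>The regression line added by <code>geom_smooth(method = "lm")</code> in the code above is fitted to the transformed values: ggplot2 applies scale transformations before computing statistics, so with a log10 y scale the model is linear in log10(y). The base R sketch below illustrates the same idea on simulated data. The variables <code>x</code>, <code>y</code>, <code>fit</code>, <code>y_hat</code> and <code>steps</code> are hypothetical and not part of the plot code, and the sketch assumes the y axis is transformed through a scale such as <code>scale_y_log10()</code>.</p>
<div className="sourceCode"><pre className="sourceCode r"><code className="sourceCode r">{`# Simulated data (hypothetical): y grows roughly exponentially with x
set.seed(1)
x <- seq(10, 100, length.out = 50)
y <- 10^(2 + 0.02 * x + rnorm(50, sd = 0.2))

# Fit the model on the transformed response, mirroring what a
# log10 y scale feeds into geom_smooth(method = "lm")
fit <- lm(log10(y) ~ x)
coef(fit)

# Back-transform the fitted values to the original units of y
y_hat <- 10^fitted(fit)

# On a semi-log plot the fitted line is straight: log10(y_hat)
# changes by the same amount for every equal step in x
steps <- diff(log10(y_hat))
all.equal(min(steps), max(steps))`}</code></pre></div>
<p>The slope reported by <code>coef(fit)</code> is the change in log10(y) per unit of x, so a one-unit increase in x multiplies the predicted y by 10 raised to that slope.</p>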
<p><br/></p>
<hr/>
<p><br/></p>
<h3 className="anchored">
<strong><em>Continue Exploring — 🚀 one level up!</em></strong>
</h3>
<p><br/></p>
<p>For highly skewed data, mathematical transformations are a powerful aid to visualizing the underlying data structure, as shown above. Such transformations are also commonly applied to the color scale. Check out the following awesome <Link to="../ggplot2-heatmap-African-population"><strong>heatmap of African population density</strong></Link>, which relies on a <strong>pseudo-logarithmic transformation of the color scale to reveal a highly skewed data pattern</strong>.</p>
<p><Link to="../ggplot2-heatmap-African-population">
  <picture>
    <source type="image/webp" srcSet="https://s3.amazonaws.com/databrewer/media/graphics/map_African_population_completed.webp" />
    <img className="img-fluid" src="https://s3.amazonaws.com/databrewer/media/graphics/map_African_population_completed.png" alt="Heatmap of African population density" />
  </picture>
</Link></p>
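<p>To see why a pseudo-logarithmic transformation suits skewed data that contains zeros, here is a minimal base R sketch. It assumes the formula used by <code>scales::pseudo_log_trans()</code>, namely asinh(x / (2 * sigma)) / log(base). The helper <code>pseudo_log</code> and the values in <code>pop_density</code> are hypothetical, and the linked article may set up its transformation differently.</p>
<div className="sourceCode"><pre className="sourceCode r"><code className="sourceCode r">{`# Pseudo-log transformation: close to linear near zero,
# close to a logarithm for large values
pseudo_log <- function(x, sigma = 1, base = 10) {
  asinh(x / (2 * sigma)) / log(base)
}

# Hypothetical, highly skewed values that include zero
pop_density <- c(0, 1, 10, 100, 1000, 10000)

# A plain log10 sends zero to -Inf, which cannot be mapped to a color
log10(pop_density)

# The pseudo-log keeps zero finite and tracks log10 for large values
pseudo_log(pop_density)`}</code></pre></div>
<p>The <code>sigma</code> argument controls how wide the near-linear region around zero is; smaller values make the curve behave like a logarithm sooner.</p>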
<p><br/><br/></p>
<p>A scatterplot is often enhanced by showing the marginal (univariate) distributions of the x and y variables, along with confidence ellipses that outline the bivariate distribution. Check out the following <Link to="../ggplot2-penguin-scatterplot-ellipse-ggExtra"><strong>scatterplot with marginal distributions and confidence ellipses</strong></Link>.</p>
<p><Link to="../ggplot2-penguin-scatterplot-ellipse-ggExtra">
  <picture>
    <source type="image/webp" srcSet="https://s3.amazonaws.com/databrewer/media/graphics/penguin_ggExtra_completed.webp" />
    <img className="img-fluid" src="https://s3.amazonaws.com/databrewer/media/graphics/penguin_ggExtra_completed.png" alt="Scatterplot with marginal distributions and confidence ellipses" />
  </picture>
</Link></p>
<p><br/><br/></p>
<p>You can also learn how to <Link to="../ggplot2-scatterplot-techniques"><strong>encircle and highlight selected points</strong></Link> in a scatterplot.</p>
<p><Link to="../ggplot2-scatterplot-techniques">
  <picture>
    <source type="image/webp" srcSet="https://s3.amazonaws.com/databrewer/media/graphics/penguins_theme_completed.webp" />
    <img className="img-fluid" src="https://s3.amazonaws.com/databrewer/media/graphics/penguins_theme_completed.png" alt="Scatterplot with selected points encircled and highlighted" />
  </picture>
</Link></p>
</main>
</div>
</div>
)}