import React from 'react'; 
import {Link} from 'react-router-dom'; 
import {useRCustomEffect} from '../../useCustomEffect'; 
import imgPseudoLogTransformation from '../graphics_blog/pseudo_log_cover.png'; 
import imgPseudoLogTransformationWebp from '../graphics_blog/pseudo_log_cover.webp'; 
export default function PseudoLogTransformation(){
useRCustomEffect()
return ( <div>
<div className="page-columns page-rows-contents page-layout-article" id="quarto-content">
<main className="content" id="quarto-document-content">
<header className="quarto-title-block default" id="title-block-header">
<div className="quarto-title">
<h1 className="title">Pseudo-Logarithm in Data Visualization</h1>
<p className="subtitle lead">A Mostly Unknown Yet Powerful Method for Data Processing</p>
</div>
<div className="quarto-title-meta">
</div>
</header>
  <picture>
    <source type="image/webp" srcset={imgPseudoLogTransformationWebp} />
    <img className="cover-img" src={imgPseudoLogTransformation} />
  </picture>

<p>Not many people have heard of this hidden gem of <em>pseudo-logarithmic transformation</em>. In this article, I’ll introduce you to this fantastic tool of data processing, and demonstrate how it adds a magic touch to the classic logarithm to create stunning data visuals, especially for <Link to="/blog/visualize-skewed-data">skewed data</Link>.</p>
<hr/>

<section className="level3" id="why-use-pseudo-logarithm">
<h3 className="anchored" data-anchor-id="why-use-pseudo-logarithm">Why Use Pseudo-Logarithm?</h3>
<p>The classic logarithm is not defined for zero and negative values, which limits its use in many applications. In comparison, <strong>the pseudo-logarithm removes this limitation of the classic logarithm</strong>: it is defined for all real numbers, behaves like a signed logarithm for large absolute values, and transitions smoothly to zero as the underlying values approach zero.</p>
</section>
<section className="level3" id="what_is_pseudo_log">
<h3 className="anchored" data-anchor-id="what_is_pseudo_log">What Is the Pseudo-Logarithm?</h3>
<p>Pseudo-logarithm of base 10 (pseudo-log10) is defined as</p>
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics_blog/pseudo_log10.webp" />
    <img className="img-fluid" src="graphics_blog/pseudo_log10.png" data-fallback="graphics_blog/pseudo_log10.png" />
  </picture>

<p>This equation involves the hyperbolic sine, <em>sinh</em>, with</p>
<p><img className="img-fluid" src="graphics_blog/hyperbolic_sinh.png"/> and <em>arsinh</em> is its inverse function, with</p>
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics_blog/hyperbolic_arsinh.webp" />
    <img className="img-fluid" src="graphics_blog/hyperbolic_arsinh.png" data-fallback="graphics_blog/hyperbolic_arsinh.png" />
  </picture>

<p>In the plot below, values on the x-axis are transformed by <strong>pseudo-log10</strong> and mapped to the y-axis, drawn as the <strong>blue</strong> curve. For comparison, the <strong>classic log10</strong> transformation is drawn as the <strong>black</strong> curve.</p>
<div className="cell" data-layout-align="center">
<details className="code-fold">
<summary>Show the code</summary>
<div className="sourceCode cell-code" id="cb1"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb1-1"><a aria-hidden="true" href="#cb1-1" tabindex="-1"></a><span className="fu">library</span>(tidyverse)</span>
<span id="cb1-2"><a aria-hidden="true" href="#cb1-2" tabindex="-1"></a></span><br/>
<span id="cb1-3"><a aria-hidden="true" href="#cb1-3" tabindex="-1"></a>pseudoLog10 <span className="ot">&lt;-</span> <span className="cf">function</span>(x) &#123; </span>
<span id="cb1-4"><a aria-hidden="true" href="#cb1-4" tabindex="-1"></a>  <span className="fu">log</span>( (x<span className="sc">/</span><span className="dv">2</span>) <span className="sc">+</span> <span className="fu">sqrt</span> ((x<span className="sc">/</span><span className="dv">2</span>)<span className="sc">^</span><span className="dv">2</span> <span className="sc">+</span> <span className="dv">1</span>) ) <span className="sc">/</span> <span className="fu">log</span>(<span className="dv">10</span>)</span>
<span id="cb1-5"><a aria-hidden="true" href="#cb1-5" tabindex="-1"></a>  <span className="co"># Alternatively, you can use the built-in function 'asinh'</span></span>
<span id="cb1-6"><a aria-hidden="true" href="#cb1-6" tabindex="-1"></a>  <span className="co"># asinh(x/2)/log(10)</span></span>
<span id="cb1-7"><a aria-hidden="true" href="#cb1-7" tabindex="-1"></a>&#125;</span>
<span id="cb1-8"><a aria-hidden="true" href="#cb1-8" tabindex="-1"></a></span><br/>
<span id="cb1-9"><a aria-hidden="true" href="#cb1-9" tabindex="-1"></a>x <span className="ot">&lt;-</span> <span className="fu">seq</span>(<span className="sc">-</span><span className="dv">12</span>, <span className="dv">12</span>, .<span className="dv">05</span>) <span className="co"># x-axis</span></span>
<span id="cb1-10"><a aria-hidden="true" href="#cb1-10" tabindex="-1"></a>y <span className="ot">&lt;-</span> <span className="fu">log</span>(x, <span className="at">base =</span> <span className="dv">10</span>) <span className="co"># map to classic log10</span></span>
<span id="cb1-11"><a aria-hidden="true" href="#cb1-11" tabindex="-1"></a>z <span className="ot">&lt;-</span> <span className="fu">pseudoLog10</span>(x)    <span className="co"># map to pseudo-log10</span></span>
<span id="cb1-12"><a aria-hidden="true" href="#cb1-12" tabindex="-1"></a></span><br/>
<span id="cb1-13"><a aria-hidden="true" href="#cb1-13" tabindex="-1"></a><span className="co"># plot</span></span>
<span id="cb1-14"><a aria-hidden="true" href="#cb1-14" tabindex="-1"></a>p <span className="ot">&lt;-</span> <span className="fu">tibble</span>(x, y, z) <span className="sc">%&gt;%</span> </span>
<span id="cb1-15"><a aria-hidden="true" href="#cb1-15" tabindex="-1"></a>  <span className="fu">ggplot</span>(<span className="fu">aes</span>(x)) <span className="sc">+</span> </span>
<span id="cb1-16"><a aria-hidden="true" href="#cb1-16" tabindex="-1"></a>  <span className="fu">geom_line</span>(<span className="fu">aes</span>(<span className="at">y =</span> y)) <span className="sc">+</span> <span className="co"># classic log10 as black line</span></span>
<span id="cb1-17"><a aria-hidden="true" href="#cb1-17" tabindex="-1"></a>  <span className="fu">geom_line</span>(<span className="fu">aes</span>(<span className="at">y =</span> z), <span className="at">color =</span> <span className="st">"#0E70C0"</span>) <span className="sc">+</span> <span className="co"># pseudo-log10 as blue line</span></span>
<span id="cb1-18"><a aria-hidden="true" href="#cb1-18" tabindex="-1"></a>  </span>
<span id="cb1-19"><a aria-hidden="true" href="#cb1-19" tabindex="-1"></a>  <span className="co"># set up axis scale</span></span>
<span id="cb1-20"><a aria-hidden="true" href="#cb1-20" tabindex="-1"></a>  <span className="fu">coord_cartesian</span>(<span className="at">xlim =</span> <span className="fu">c</span>(<span className="sc">-</span><span className="dv">10</span>, <span className="dv">10</span>), <span className="at">ylim =</span> <span className="fu">c</span>(<span className="sc">-</span><span className="dv">1</span>, <span className="dv">1</span>)) <span className="sc">+</span></span>
<span id="cb1-21"><a aria-hidden="true" href="#cb1-21" tabindex="-1"></a>  <span className="fu">scale_x_continuous</span>(<span className="at">breaks =</span> <span className="fu">seq</span>(<span className="sc">-</span><span className="dv">10</span>, <span className="dv">10</span>, <span className="dv">2</span>)) <span className="sc">+</span></span>
<span id="cb1-22"><a aria-hidden="true" href="#cb1-22" tabindex="-1"></a>  <span className="fu">scale_y_continuous</span>(<span className="at">breaks =</span> <span className="fu">seq</span>(<span className="sc">-</span><span className="dv">1</span>, <span className="dv">1</span>, .<span className="dv">2</span>)) <span className="sc">+</span></span>
<span id="cb1-23"><a aria-hidden="true" href="#cb1-23" tabindex="-1"></a>  </span>
<span id="cb1-24"><a aria-hidden="true" href="#cb1-24" tabindex="-1"></a>  <span className="co"># label the curves with transformation names</span></span>
<span id="cb1-25"><a aria-hidden="true" href="#cb1-25" tabindex="-1"></a>  <span className="fu">annotate</span>(</span>
<span id="cb1-26"><a aria-hidden="true" href="#cb1-26" tabindex="-1"></a>    <span className="at">geom =</span> <span className="st">"text"</span>, </span>
<span id="cb1-27"><a aria-hidden="true" href="#cb1-27" tabindex="-1"></a>    <span className="at">x =</span> <span className="fu">c</span>(<span className="sc">-</span><span className="dv">6</span>, <span className="fl">3.5</span>), </span>
<span id="cb1-28"><a aria-hidden="true" href="#cb1-28" tabindex="-1"></a>    <span className="at">y =</span> <span className="sc">-</span>.<span className="dv">3</span>, </span>
<span id="cb1-29"><a aria-hidden="true" href="#cb1-29" tabindex="-1"></a>    <span className="at">size =</span> <span className="dv">4</span>, </span>
<span id="cb1-30"><a aria-hidden="true" href="#cb1-30" tabindex="-1"></a>    <span className="at">fontface =</span> <span className="st">"bold"</span>,</span>
<span id="cb1-31"><a aria-hidden="true" href="#cb1-31" tabindex="-1"></a>    <span className="at">label =</span> <span className="fu">c</span>(<span className="st">"pseudo-log10"</span>, <span className="st">"classic log10"</span>),</span>
<span id="cb1-32"><a aria-hidden="true" href="#cb1-32" tabindex="-1"></a>    <span className="at">color =</span> <span className="fu">c</span>(<span className="st">"#0E70C0"</span>, <span className="st">"black"</span>)) <span className="sc">+</span></span>
<span id="cb1-33"><a aria-hidden="true" href="#cb1-33" tabindex="-1"></a>  </span>
<span id="cb1-34"><a aria-hidden="true" href="#cb1-34" tabindex="-1"></a>  <span className="fu">theme_minimal</span>(<span className="at">base_size =</span> <span className="dv">14</span>) <span className="sc">+</span></span>
<span id="cb1-35"><a aria-hidden="true" href="#cb1-35" tabindex="-1"></a>  <span className="fu">geom_hline</span>(<span className="at">yintercept =</span> <span className="dv">0</span>, <span className="at">color =</span> <span className="st">"tomato"</span>, <span className="at">alpha =</span> .<span className="dv">4</span>) <span className="sc">+</span></span>
<span id="cb1-36"><a aria-hidden="true" href="#cb1-36" tabindex="-1"></a>  <span className="fu">geom_vline</span>(<span className="at">xintercept =</span> <span className="dv">0</span>, <span className="at">color =</span> <span className="st">"tomato"</span>, <span className="at">alpha =</span> .<span className="dv">4</span>) <span className="sc">+</span></span>
<span id="cb1-37"><a aria-hidden="true" href="#cb1-37" tabindex="-1"></a>  </span>
<span id="cb1-38"><a aria-hidden="true" href="#cb1-38" tabindex="-1"></a>  <span className="co"># mark critical points</span></span>
<span id="cb1-39"><a aria-hidden="true" href="#cb1-39" tabindex="-1"></a>  <span className="fu">annotate</span>(<span className="at">geom =</span> <span className="st">"point"</span>,</span>
<span id="cb1-40"><a aria-hidden="true" href="#cb1-40" tabindex="-1"></a>           <span className="at">x =</span> <span className="fu">c</span>(<span className="sc">-</span><span className="dv">10</span>, <span className="dv">0</span>, <span className="dv">1</span>, <span className="dv">10</span>), </span>
<span id="cb1-41"><a aria-hidden="true" href="#cb1-41" tabindex="-1"></a>           <span className="at">y =</span> <span className="fu">c</span>(<span className="sc">-</span><span className="dv">1</span>, <span className="dv">0</span>, <span className="dv">0</span>, <span className="dv">1</span>),</span>
<span id="cb1-42"><a aria-hidden="true" href="#cb1-42" tabindex="-1"></a>           <span className="at">size =</span> <span className="dv">2</span>) </span>
<span id="cb1-43"><a aria-hidden="true" href="#cb1-43" tabindex="-1"></a>p</span></code></pre></div>
</details>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics_blog/pseudo_log_curve.webp" />
    <img className="img-fluid figure-img" src="graphics_blog/pseudo_log_curve.png" data-fallback="graphics_blog/pseudo_log_curve.png" />
  </picture>

</div>
</div>
</div>
<p><strong>This plot shows some nice properties of pseudo-log transformation:</strong></p>
<ul>
<li>pseudo-log10(x) is defined for all real numbers and is monotonically increasing.</li>
<li>pseudo-log10(0) = 0</li>
<li>pseudo-log10(−x) = −pseudo-log10(x)</li>
<li>If x ≫ 0, pseudo-log10(x) ≈ log10(x)</li>
<li>If x ≪ 0, pseudo-log10(x) ≈ −log10(|x|)</li>
</ul>
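<p>These properties can be checked numerically. The snippet below is a minimal sketch that redefines <code>pseudoLog10</code> with the built-in <code>asinh</code> so that it runs on its own; the test values are arbitrary choices for illustration.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span><span className="co"># pseudo-log10 via the built-in asinh (same formula as in the plot code)</span></span>
<span>pseudoLog10 &lt;- function(x) asinh(x / 2) / log(10)</span>
<span></span><br/>
<span>pseudoLog10(0)                     <span className="co"># zero maps to zero</span></span>
<span>pseudoLog10(-5) + pseudoLog10(5)   <span className="co"># odd symmetry: the two terms cancel</span></span>
<span>pseudoLog10(1e6) - log10(1e6)      <span className="co"># difference is close to zero for large x</span></span>
<span></span><br/>
<span><span className="co"># monotonically increasing over an arbitrary test grid</span></span>
<span>all(diff(pseudoLog10(seq(-100, 100, 0.5))) &gt; 0)</span></code></pre></div>
</div>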
<p>Similarly, <strong>the pseudo-logarithm of any base <em>b</em> (pseudo-log <em>b</em>) can be defined as</strong></p>
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics_blog/pseudo_logb.webp" />
    <img className="img-fluid" src="graphics_blog/pseudo_logb.png" data-fallback="graphics_blog/pseudo_logb.png" />
  </picture>

<p>Pseudo-log<em>b</em>(x) has the following properties:</p>
<ul>
<li><p>pseudo-log<em>b</em>(0) = 0</p></li>
<li><p>If x ≫ 0, pseudo-log<em>b</em>(x) ≈ log<em>b</em>(x)</p></li>
<li><p>If x ≪ 0, pseudo-log<em>b</em>(x) ≈ −log<em>b</em>(|x|)</p></li>
</ul>
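<p>As a rough sketch, and assuming the general formula mirrors the base-10 code shown earlier (with <code>log(b)</code> in place of <code>log(10)</code>), the definition translates into a one-line R function. The helper name <code>pseudoLogB</code> is hypothetical. The last line assumes the <code>scales</code> package is installed; its <code>pseudo_log_trans()</code> uses the same arsinh form when <code>sigma</code> is left at its default of 1.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span><span className="co"># hypothetical helper: pseudo-logarithm of base b (natural base by default)</span></span>
<span>pseudoLogB &lt;- function(x, b = exp(1)) asinh(x / 2) / log(b)</span>
<span></span><br/>
<span>x &lt;- c(-1000, -1, 0, 1, 1000)  <span className="co"># arbitrary test values</span></span>
<span>pseudoLogB(x, b = 2)</span>
<span></span><br/>
<span><span className="co"># the same transformation as provided by the scales package</span></span>
<span>scales::pseudo_log_trans(base = 2)$transform(x)</span></code></pre></div>
</div>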
</section>
<section className="level3" id="pseudoLog_visual">
<h3 className="anchored" data-anchor-id="pseudoLog_visual">Pseudo-Logarithm in Data Visualization</h3>
</section>

<section className="level3" id="pseudoLog_histogram">
<h3 className="anchored" data-anchor-id="pseudoLog_histogram"><em>a) pseudo-log transform in histogram</em></h3>
<p>The African population has a very skewed distribution. If we divide Africa into grids of latitude and longitude, and count the population in each cell, we can plot the population distribution as a histogram below.</p>
<div className="cell" data-layout-align="center">
<details className="code-fold">
<summary>Show the code</summary>
<div className="sourceCode cell-code" id="cb2"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb2-1"><a aria-hidden="true" href="#cb2-1" tabindex="-1"></a><span className="co"># the African population dataset</span></span>
<span id="cb2-2"><a aria-hidden="true" href="#cb2-2" tabindex="-1"></a><span className="co"># install.packages("remotes") # if not already installed</span></span>
<span id="cb2-3"><a aria-hidden="true" href="#cb2-3" tabindex="-1"></a><span className="co"># remotes::install_github("afrimapr/afrilearndata") # if fails, try restarting R</span></span>
<span id="cb2-4"><a aria-hidden="true" href="#cb2-4" tabindex="-1"></a><span className="fu">library</span>(afrilearndata) </span>
<span id="cb2-5"><a aria-hidden="true" href="#cb2-5" tabindex="-1"></a><span className="co"># packages for data cleanup</span></span>
<span id="cb2-6"><a aria-hidden="true" href="#cb2-6" tabindex="-1"></a><span className="fu">library</span>(raster) </span>
<span id="cb2-7"><a aria-hidden="true" href="#cb2-7" tabindex="-1"></a><span className="fu">library</span>(sp)</span>
<span id="cb2-8"><a aria-hidden="true" href="#cb2-8" tabindex="-1"></a></span><br/>
<span id="cb2-9"><a aria-hidden="true" href="#cb2-9" tabindex="-1"></a><span className="co"># Data cleanup</span></span>
<span id="cb2-10"><a aria-hidden="true" href="#cb2-10" tabindex="-1"></a>afripop_df <span className="ot">&lt;-</span> afripop2020 <span className="sc">%&gt;%</span> </span>
<span id="cb2-11"><a aria-hidden="true" href="#cb2-11" tabindex="-1"></a>  <span className="fu">as.data.frame</span>(<span className="at">xy =</span> <span className="cn">TRUE</span>) <span className="sc">%&gt;%</span> </span>
<span id="cb2-12"><a aria-hidden="true" href="#cb2-12" tabindex="-1"></a>  <span className="fu">rename</span>(<span className="at">pop =</span> <span className="dv">3</span>) <span className="sc">%&gt;%</span> </span>
<span id="cb2-13"><a aria-hidden="true" href="#cb2-13" tabindex="-1"></a>  <span className="fu">filter</span>(<span className="sc">!</span><span className="fu">is.na</span>(pop)) <span className="sc">%&gt;%</span> </span>
<span id="cb2-14"><a aria-hidden="true" href="#cb2-14" tabindex="-1"></a>  <span className="fu">as_tibble</span>()</span>
<span id="cb2-15"><a aria-hidden="true" href="#cb2-15" tabindex="-1"></a></span><br/>
<span id="cb2-16"><a aria-hidden="true" href="#cb2-16" tabindex="-1"></a><span className="co"># mark some fixed population values for ease of comparison</span></span>
<span id="cb2-17"><a aria-hidden="true" href="#cb2-17" tabindex="-1"></a>myBreaks <span className="ot">&lt;-</span> <span className="fu">c</span>(<span className="dv">0</span>, .<span className="dv">1</span>, <span className="dv">1</span>, <span className="dv">10</span>, <span className="dv">100</span>, <span className="dv">1000</span>, <span className="dv">5000</span>, <span className="dv">10000</span>, <span className="dv">20000</span>)</span>
<span id="cb2-18"><a aria-hidden="true" href="#cb2-18" tabindex="-1"></a></span><br/>
<span id="cb2-19"><a aria-hidden="true" href="#cb2-19" tabindex="-1"></a><span className="co"># create a function creating histograms with specified transformation</span></span>
<span id="cb2-20"><a aria-hidden="true" href="#cb2-20" tabindex="-1"></a>hist <span className="ot">&lt;-</span> <span className="cf">function</span>(</span>
<span id="cb2-21"><a aria-hidden="true" href="#cb2-21" tabindex="-1"></a>    <span className="at">transformation =</span> <span className="st">"identity"</span>, </span>
<span id="cb2-22"><a aria-hidden="true" href="#cb2-22" tabindex="-1"></a>    <span className="at">title =</span> <span className="st">"no transform"</span></span>
<span id="cb2-23"><a aria-hidden="true" href="#cb2-23" tabindex="-1"></a>)&#123;</span>
<span id="cb2-24"><a aria-hidden="true" href="#cb2-24" tabindex="-1"></a>  p <span className="ot">&lt;-</span> afripop_df <span className="sc">%&gt;%</span> </span>
<span id="cb2-25"><a aria-hidden="true" href="#cb2-25" tabindex="-1"></a>    <span className="fu">ggplot</span>(<span className="fu">aes</span>(<span className="at">x =</span> pop)) <span className="sc">+</span> </span>
<span id="cb2-26"><a aria-hidden="true" href="#cb2-26" tabindex="-1"></a>    <span className="fu">geom_histogram</span>(<span className="at">bins =</span>  <span className="dv">100</span>) <span className="sc">+</span></span>
<span id="cb2-27"><a aria-hidden="true" href="#cb2-27" tabindex="-1"></a>    <span className="fu">scale_x_continuous</span>(</span>
<span id="cb2-28"><a aria-hidden="true" href="#cb2-28" tabindex="-1"></a>      <span className="at">transform =</span> transformation,</span>
<span id="cb2-29"><a aria-hidden="true" href="#cb2-29" tabindex="-1"></a>      <span className="at">breaks =</span> myBreaks,</span>
<span id="cb2-30"><a aria-hidden="true" href="#cb2-30" tabindex="-1"></a>      <span className="at">labels =</span> scales<span className="sc">::</span>comma,</span>
<span id="cb2-31"><a aria-hidden="true" href="#cb2-31" tabindex="-1"></a>      <span className="at">minor_breaks =</span> <span className="cn">NULL</span>,</span>
<span id="cb2-32"><a aria-hidden="true" href="#cb2-32" tabindex="-1"></a>      <span className="at">name =</span> <span className="cn">NULL</span></span>
<span id="cb2-33"><a aria-hidden="true" href="#cb2-33" tabindex="-1"></a>    ) <span className="sc">+</span></span>
<span id="cb2-34"><a aria-hidden="true" href="#cb2-34" tabindex="-1"></a>    <span className="fu">theme_bw</span>() <span className="sc">+</span></span>
<span id="cb2-35"><a aria-hidden="true" href="#cb2-35" tabindex="-1"></a>    <span className="fu">theme</span>(</span>
<span id="cb2-36"><a aria-hidden="true" href="#cb2-36" tabindex="-1"></a>      <span className="at">legend.position =</span> <span className="st">"bottom"</span>,</span>
<span id="cb2-37"><a aria-hidden="true" href="#cb2-37" tabindex="-1"></a>      <span className="at">plot.title =</span> <span className="fu">element_text</span>(<span className="at">hjust =</span> .<span className="dv">5</span>, <span className="at">face =</span> <span className="st">"bold"</span>, <span className="at">size =</span> <span className="dv">17</span>, <span className="at">color =</span> <span className="st">"turquoise4"</span>),</span>
<span id="cb2-38"><a aria-hidden="true" href="#cb2-38" tabindex="-1"></a>      <span className="at">panel.grid.major.y =</span> <span className="fu">element_blank</span>(),</span>
<span id="cb2-39"><a aria-hidden="true" href="#cb2-39" tabindex="-1"></a>      <span className="at">panel.grid.minor.y =</span> <span className="fu">element_blank</span>(),</span>
<span id="cb2-40"><a aria-hidden="true" href="#cb2-40" tabindex="-1"></a>      <span className="at">axis.text.x =</span> <span className="fu">element_text</span>(<span className="at">angle =</span> <span className="dv">90</span>, <span className="at">hjust =</span> <span className="dv">1</span>),</span>
<span id="cb2-41"><a aria-hidden="true" href="#cb2-41" tabindex="-1"></a>      <span className="at">panel.border =</span> <span className="fu">element_blank</span>(),</span>
<span id="cb2-42"><a aria-hidden="true" href="#cb2-42" tabindex="-1"></a>      <span className="at">plot.background =</span> <span className="fu">element_rect</span>(<span className="at">color =</span> <span className="st">"black"</span>)</span>
<span id="cb2-43"><a aria-hidden="true" href="#cb2-43" tabindex="-1"></a>    ) <span className="sc">+</span></span>
<span id="cb2-44"><a aria-hidden="true" href="#cb2-44" tabindex="-1"></a>    <span className="fu">ggtitle</span>(title)</span>
<span id="cb2-45"><a aria-hidden="true" href="#cb2-45" tabindex="-1"></a>  <span className="fu">return</span>(p)</span>
<span id="cb2-46"><a aria-hidden="true" href="#cb2-46" tabindex="-1"></a>&#125;</span>
<span id="cb2-47"><a aria-hidden="true" href="#cb2-47" tabindex="-1"></a></span><br/>
<span id="cb2-48"><a aria-hidden="true" href="#cb2-48" tabindex="-1"></a><span className="co"># Draw histograms with specified transformations</span></span>
<span id="cb2-49"><a aria-hidden="true" href="#cb2-49" tabindex="-1"></a>h1 <span className="ot">&lt;-</span> <span className="fu">hist</span>() </span>
<span id="cb2-50"><a aria-hidden="true" href="#cb2-50" tabindex="-1"></a>h2 <span className="ot">&lt;-</span> <span className="fu">hist</span>(<span className="at">transformation =</span> <span className="st">"log"</span>, <span className="at">title =</span> <span className="st">"log"</span>) </span>
<span id="cb2-51"><a aria-hidden="true" href="#cb2-51" tabindex="-1"></a>h3 <span className="ot">&lt;-</span> <span className="fu">hist</span>(<span className="at">transformation =</span> <span className="st">"pseudo_log"</span>, <span className="at">title =</span> <span className="st">"pseudo-log"</span>) </span>
<span id="cb2-52"><a aria-hidden="true" href="#cb2-52" tabindex="-1"></a>h4 <span className="ot">&lt;-</span> <span className="fu">hist</span>(<span className="at">transformation =</span> <span className="st">"log1p"</span>, <span className="at">title =</span> <span className="st">"log ( 1 + x )"</span>)</span>
<span id="cb2-53"><a aria-hidden="true" href="#cb2-53" tabindex="-1"></a></span><br/>
<span id="cb2-54"><a aria-hidden="true" href="#cb2-54" tabindex="-1"></a><span className="co"># Plot all together</span></span>
<span id="cb2-55"><a aria-hidden="true" href="#cb2-55" tabindex="-1"></a>cowplot<span className="sc">::</span><span className="fu">plot_grid</span>(h1, h2, h3, h4, <span className="at">nrow =</span> <span className="dv">2</span>)</span></code></pre></div>
</details>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics_blog/pseudo_log_histogram.webp" />
    <img className="img-fluid figure-img" src="graphics_blog/pseudo_log_histogram.png" data-fallback="graphics_blog/pseudo_log_histogram.png" />
  </picture>

</div>
</div>
</div>
<p>Most places have a very low population density (including zero), and only a small number of cells contain a large population covering a wide numeric range. In comparison, the <strong>classic log</strong>, <strong>pseudo-log</strong>, and <strong>log(1+x)</strong> transformations remedy the skewness to varying extents. (The impact of each transformation on the value zero is not yet obvious in the histograms.)</p>
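<p>To see how each transformation treats zero and tiny fractions, here is a small sketch in base R. The input values are made up for illustration, and <code>asinh(x / 2)</code> is assumed to match the natural-base form that the <code>"pseudo_log"</code> transformation applies with its default settings.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span><span className="co"># made-up values: zero, a tiny fraction, and two ordinary counts</span></span>
<span>x &lt;- c(0, 1e-7, 1, 1000)</span>
<span></span><br/>
<span>log(x)        <span className="co"># classic log: -Inf at zero, strongly negative for the tiny fraction</span></span>
<span>asinh(x / 2)  <span className="co"># pseudo-log (natural base): zero stays zero</span></span>
<span>log1p(x)      <span className="co"># log(1 + x): zero stays zero</span></span></code></pre></div>
</div>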
</section>

<section className="level3" id="pseudoLog_heatmap">
<h3 className="anchored" data-anchor-id="pseudoLog_heatmap"><em>b) pseudo-log transform in heatmap</em></h3>
<p>The impact of the transformations is most pronounced when the data is visualized on a color scale, such as in the heatmaps below (with the classic <Link to="https://www.databrewer.co/R/visualization/16-ggplot2-color-viridis-palette">viridis</Link> palette).</p>
<div className="cell" data-layout-align="center">
<details className="code-fold">
<summary>Show the code</summary>
<div className="sourceCode cell-code" id="cb3"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb3-1"><a aria-hidden="true" href="#cb3-1" tabindex="-1"></a><span className="co"># Create a function creating heatmaps with specified transformation</span></span>
<span id="cb3-2"><a aria-hidden="true" href="#cb3-2" tabindex="-1"></a>heat <span className="ot">&lt;-</span> <span className="cf">function</span>(data, </span>
<span id="cb3-3"><a aria-hidden="true" href="#cb3-3" tabindex="-1"></a>                 <span className="at">transformation =</span> <span className="st">"identity"</span>, </span>
<span id="cb3-4"><a aria-hidden="true" href="#cb3-4" tabindex="-1"></a>                 <span className="at">title =</span> <span className="st">"no transform"</span>)&#123;</span>
<span id="cb3-5"><a aria-hidden="true" href="#cb3-5" tabindex="-1"></a>  data <span className="sc">%&gt;%</span> </span>
<span id="cb3-6"><a aria-hidden="true" href="#cb3-6" tabindex="-1"></a>    <span className="fu">ggplot</span>() <span className="sc">+</span></span>
<span id="cb3-7"><a aria-hidden="true" href="#cb3-7" tabindex="-1"></a>    <span className="co"># Draw heatmaps</span></span>
<span id="cb3-8"><a aria-hidden="true" href="#cb3-8" tabindex="-1"></a>    <span className="fu">geom_raster</span>(<span className="fu">aes</span>(x, y, <span className="at">fill =</span> pop)) <span className="sc">+</span></span>
<span id="cb3-9"><a aria-hidden="true" href="#cb3-9" tabindex="-1"></a>    <span className="fu">coord_fixed</span>(<span className="at">ratio =</span> <span className="fl">1.1</span>) <span className="sc">+</span></span>
<span id="cb3-10"><a aria-hidden="true" href="#cb3-10" tabindex="-1"></a>    <span className="co"># Adjust color scale with transformation</span></span>
<span id="cb3-11"><a aria-hidden="true" href="#cb3-11" tabindex="-1"></a>    <span className="fu">scale_fill_viridis_c</span>(</span>
<span id="cb3-12"><a aria-hidden="true" href="#cb3-12" tabindex="-1"></a>      <span className="at">trans =</span> transformation, </span>
<span id="cb3-13"><a aria-hidden="true" href="#cb3-13" tabindex="-1"></a>      <span className="at">option =</span> <span className="st">"B"</span>,</span>
<span id="cb3-14"><a aria-hidden="true" href="#cb3-14" tabindex="-1"></a>      <span className="at">breaks =</span> myBreaks, </span>
<span id="cb3-15"><a aria-hidden="true" href="#cb3-15" tabindex="-1"></a>      <span className="at">labels =</span> myBreaks) <span className="sc">+</span></span>
<span id="cb3-16"><a aria-hidden="true" href="#cb3-16" tabindex="-1"></a>    <span className="fu">ggtitle</span>(title)  <span className="sc">+</span></span>
<span id="cb3-17"><a aria-hidden="true" href="#cb3-17" tabindex="-1"></a>    <span className="co"># adjust color bar style</span></span>
<span id="cb3-18"><a aria-hidden="true" href="#cb3-18" tabindex="-1"></a>    <span className="fu">guides</span>(<span className="at">fill =</span> <span className="fu">guide_colorbar</span>(</span>
<span id="cb3-19"><a aria-hidden="true" href="#cb3-19" tabindex="-1"></a>      <span className="at">barwidth =</span> <span className="fu">unit</span>(<span className="dv">7</span>, <span className="st">"pt"</span>),</span>
<span id="cb3-20"><a aria-hidden="true" href="#cb3-20" tabindex="-1"></a>      <span className="at">barheight =</span> <span className="fu">unit</span>(<span className="dv">220</span>, <span className="st">"pt"</span>),</span>
<span id="cb3-21"><a aria-hidden="true" href="#cb3-21" tabindex="-1"></a>      <span className="at">title =</span> <span className="cn">NULL</span>, </span>
<span id="cb3-22"><a aria-hidden="true" href="#cb3-22" tabindex="-1"></a>      <span className="at">title.theme =</span> <span className="fu">element_text</span>(<span className="at">hjust =</span> .<span className="dv">5</span>, <span className="at">face =</span> <span className="st">"bold"</span>))) <span className="sc">+</span></span>
<span id="cb3-23"><a aria-hidden="true" href="#cb3-23" tabindex="-1"></a>    <span className="co"># theme</span></span>
<span id="cb3-24"><a aria-hidden="true" href="#cb3-24" tabindex="-1"></a>    <span className="fu">theme_void</span>() <span className="sc">+</span></span>
<span id="cb3-25"><a aria-hidden="true" href="#cb3-25" tabindex="-1"></a>    <span className="fu">theme</span>(<span className="at">plot.margin =</span> <span className="fu">margin</span>(<span className="fu">rep</span>(<span className="dv">5</span>, <span className="dv">4</span>)),</span>
<span id="cb3-26"><a aria-hidden="true" href="#cb3-26" tabindex="-1"></a>          <span className="at">plot.title =</span> <span className="fu">element_text</span>(</span>
<span id="cb3-27"><a aria-hidden="true" href="#cb3-27" tabindex="-1"></a>            <span className="at">hjust =</span> .<span className="dv">5</span>, <span className="at">face =</span> <span className="st">"bold"</span>, <span className="at">size =</span> <span className="dv">18</span>, <span className="at">color =</span> <span className="st">"turquoise4"</span>))</span>
<span id="cb3-28"><a aria-hidden="true" href="#cb3-28" tabindex="-1"></a>&#125;</span>
<span id="cb3-29"><a aria-hidden="true" href="#cb3-29" tabindex="-1"></a></span><br/>
<span id="cb3-30"><a aria-hidden="true" href="#cb3-30" tabindex="-1"></a><span className="co"># plot with different logarithmic transformations</span></span>
<span id="cb3-31"><a aria-hidden="true" href="#cb3-31" tabindex="-1"></a>heat1 <span className="ot">&lt;-</span> afripop_df <span className="sc">%&gt;%</span> <span className="fu">heat</span>()</span>
<span id="cb3-32"><a aria-hidden="true" href="#cb3-32" tabindex="-1"></a>heat2 <span className="ot">&lt;-</span> afripop_df <span className="sc">%&gt;%</span> <span className="fu">heat</span>(<span className="at">transformation =</span> <span className="st">"log"</span>, <span className="at">title =</span> <span className="st">"log"</span>)</span>
<span id="cb3-33"><a aria-hidden="true" href="#cb3-33" tabindex="-1"></a>heat3 <span className="ot">&lt;-</span> afripop_df <span className="sc">%&gt;%</span> <span className="fu">heat</span>(<span className="at">transformation =</span> <span className="st">"pseudo_log"</span>, <span className="at">title =</span> <span className="st">"pseudo-log"</span>)</span>
<span id="cb3-34"><a aria-hidden="true" href="#cb3-34" tabindex="-1"></a>heat4 <span className="ot">&lt;-</span> afripop_df <span className="sc">%&gt;%</span> <span className="fu">heat</span>(<span className="at">transformation =</span> <span className="st">"log1p"</span>, <span className="at">title =</span> <span className="st">"log (1 + x )"</span>)</span>
<span id="cb3-35"><a aria-hidden="true" href="#cb3-35" tabindex="-1"></a></span><br/>
<span id="cb3-36"><a aria-hidden="true" href="#cb3-36" tabindex="-1"></a>cowplot<span className="sc">::</span><span className="fu">plot_grid</span>(heat1, heat2, heat3, heat4, <span className="at">nrow =</span> <span className="dv">2</span>)</span></code></pre></div>
</details>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics_blog/pseudo_log_logarithmic_transform.webp" />
    <img className="img-fluid figure-img" src="graphics_blog/pseudo_log_logarithmic_transform.png" data-fallback="graphics_blog/pseudo_log_logarithmic_transform.png" />
  </picture>

</div>
</div>
</div>
<ul>
<li><p><strong>Without data transformation:</strong> the map is almost completely blacked out. The few scattered large values (depicted in brighter colors) are drowned out by the bulk of much smaller numbers.</p></li>
<li><p><strong>Classic logarithmic transform:</strong> it nicely unveils a data pattern. Most places in Egypt, however, are shown in grey (not part of the color scale): these places have a population value of zero, whose logarithm is undefined (it evaluates to <code>-Inf</code>), so they are treated as missing values. In addition, the tiny fractional numbers in the dataset (e.g., 0.0000001) produce large negative transformed values, which occupy and waste a wide range of the color scale.</p></li>
<li><p><strong>Pseudo-log transform:</strong> it performs the classic logarithmic operation for large numbers, but gradually transitions to a more “linear” scale as the values approach zero. It generates an impressive heatmap with a well-defined data pattern, highlighting the most populous geographical sites and the dire inhospitality of the vast Sahara Desert.</p></li>
<li><p><strong>log(1+x):</strong> a somewhat brute-force yet effective practice that neutralizes zero and small fractional numbers by adding 1 to every value before taking the logarithm. Both pseudo-log and log(1+x) transforms lead to a similar transformed data scale starting from 0, and properly highlight the most populous regions (which are of the most interest).</p></li>
</ul>
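<p>To make the comparison above concrete, here is a minimal numeric sketch in base R. It assumes the signed pseudo-logarithm takes the form <code>asinh(x / 2) / log(10)</code> (base 10, with a scale parameter of 1). The helper name <code>pseudo_log10</code> and the sample values are made up for illustration and are not part of the code that produced the maps.</p>
<div className="cell">
<div className="sourceCode cell-code" id="cb5"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb5-1"><a aria-hidden="true" href="#cb5-1" tabindex="-1"></a><span className="co"># sample values: zero, a tiny fraction, and increasingly large numbers</span></span>
<span id="cb5-2"><a aria-hidden="true" href="#cb5-2" tabindex="-1"></a>x <span className="ot">&lt;-</span> <span className="fu">c</span>(0, 1e-7, 1, 10, 1000, 1e6)</span>
<span id="cb5-3"><a aria-hidden="true" href="#cb5-3" tabindex="-1"></a></span><br/>
<span id="cb5-4"><a aria-hidden="true" href="#cb5-4" tabindex="-1"></a><span className="co"># hypothetical helper: signed pseudo-logarithm in base 10 (assumed form, sigma = 1)</span></span>
<span id="cb5-5"><a aria-hidden="true" href="#cb5-5" tabindex="-1"></a>pseudo_log10 <span className="ot">&lt;-</span> <span className="cf">function</span>(x) <span className="fu">asinh</span>(x / 2) / <span className="fu">log</span>(10)</span>
<span id="cb5-6"><a aria-hidden="true" href="#cb5-6" tabindex="-1"></a></span><br/>
<span id="cb5-7"><a aria-hidden="true" href="#cb5-7" tabindex="-1"></a><span className="co"># compare the three transforms side by side</span></span>
<span id="cb5-8"><a aria-hidden="true" href="#cb5-8" tabindex="-1"></a><span className="fu">data.frame</span>(</span>
<span id="cb5-9"><a aria-hidden="true" href="#cb5-9" tabindex="-1"></a>  <span className="at">x =</span> x,</span>
<span id="cb5-10"><a aria-hidden="true" href="#cb5-10" tabindex="-1"></a>  <span className="at">log10 =</span> <span className="fu">log10</span>(x),  <span className="co"># -Inf at zero, strongly negative for tiny fractions</span></span>
<span id="cb5-11"><a aria-hidden="true" href="#cb5-11" tabindex="-1"></a>  <span className="at">pseudo_log =</span> <span className="fu">pseudo_log10</span>(x),  <span className="co"># zero at zero, close to log10 for large x</span></span>
<span id="cb5-12"><a aria-hidden="true" href="#cb5-12" tabindex="-1"></a>  <span className="at">log1p =</span> <span className="fu">log1p</span>(x) / <span className="fu">log</span>(10)  <span className="co"># log10(1 + x): also zero at zero</span></span>
<span id="cb5-13"><a aria-hidden="true" href="#cb5-13" tabindex="-1"></a>)</span></code></pre></div>
</div>
<p>In this sketch, zero maps to <code>-Inf</code> under <code>log10</code> but to exactly 0 under the other two transforms, while for large inputs all three columns nearly coincide.</p>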
</section>

<section className="level3" id="root_transform">
<h3 className="anchored" data-anchor-id="root_transform"><em>c) compare with root transform</em></h3>
<p>It is interesting to compare the pseudo-log transformation with the commonly used root transformation. A root transform with a base larger than 1, e.g., square root and cube root, is equivalent to a power transform with an exponent smaller than 1. It shrinks values bigger than 1, and reduces larger values more strongly than smaller ones. This squeezes the values closer together into a shorter range, making them easier to visualize.</p>
<div className="cell" data-layout-align="center">
<details className="code-fold">
<summary>Show the code</summary>
<div className="sourceCode cell-code" id="cb4"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb4-1"><a aria-hidden="true" href="#cb4-1" tabindex="-1"></a><span className="co"># Create a function with root transformation (base specified by 'a')</span></span>
<span id="cb4-2"><a aria-hidden="true" href="#cb4-2" tabindex="-1"></a>r <span className="ot">&lt;-</span> <span className="cf">function</span>(a)&#123;</span>
<span id="cb4-3"><a aria-hidden="true" href="#cb4-3" tabindex="-1"></a>  <span className="co"># perform root transformation (raise to the fractional power 1/a)</span></span>
<span id="cb4-4"><a aria-hidden="true" href="#cb4-4" tabindex="-1"></a>  <span className="fu">heat</span>( afripop_df <span className="sc">%&gt;%</span> <span className="fu">mutate</span>(<span className="at">pop =</span> (pop)<span className="sc">^</span>(<span className="dv">1</span><span className="sc">/</span>a)) ) <span className="sc">+</span> </span>
<span id="cb4-5"><a aria-hidden="true" href="#cb4-5" tabindex="-1"></a>    <span className="co"># update the color scale</span></span>
<span id="cb4-6"><a aria-hidden="true" href="#cb4-6" tabindex="-1"></a>    <span className="fu">scale_fill_viridis_c</span>(</span>
<span id="cb4-7"><a aria-hidden="true" href="#cb4-7" tabindex="-1"></a>      <span className="at">option =</span> <span className="st">"B"</span>,</span>
<span id="cb4-8"><a aria-hidden="true" href="#cb4-8" tabindex="-1"></a>      <span className="at">breaks =</span> myBreaks <span className="sc">^</span> (<span className="dv">1</span><span className="sc">/</span>a), </span>
<span id="cb4-9"><a aria-hidden="true" href="#cb4-9" tabindex="-1"></a>      <span className="co"># reverse back to original data before transformation</span></span>
<span id="cb4-10"><a aria-hidden="true" href="#cb4-10" tabindex="-1"></a>      <span className="at">labels =</span> <span className="cf">function</span>(x) &#123;<span className="fu">round</span>((x<span className="sc">^</span>a), <span className="dv">4</span>)&#125; </span>
<span id="cb4-11"><a aria-hidden="true" href="#cb4-11" tabindex="-1"></a>    ) <span className="sc">+</span></span>
<span id="cb4-12"><a aria-hidden="true" href="#cb4-12" tabindex="-1"></a>    <span className="fu">ggtitle</span>(<span className="fu">paste</span>(<span className="st">"root base"</span>, a))</span>
<span id="cb4-13"><a aria-hidden="true" href="#cb4-13" tabindex="-1"></a>&#125;</span>
<span id="cb4-14"><a aria-hidden="true" href="#cb4-14" tabindex="-1"></a></span><br/>
<span id="cb4-15"><a aria-hidden="true" href="#cb4-15" tabindex="-1"></a>r3 <span className="ot">&lt;-</span> <span className="fu">r</span>(<span className="at">a =</span> <span className="dv">3</span>)</span>
<span id="cb4-16"><a aria-hidden="true" href="#cb4-16" tabindex="-1"></a>r5 <span className="ot">&lt;-</span> <span className="fu">r</span>(<span className="at">a =</span> <span className="dv">5</span>)</span>
<span id="cb4-17"><a aria-hidden="true" href="#cb4-17" tabindex="-1"></a>r7 <span className="ot">&lt;-</span> <span className="fu">r</span>(<span className="at">a =</span> <span className="dv">7</span>)</span>
<span id="cb4-18"><a aria-hidden="true" href="#cb4-18" tabindex="-1"></a>r10 <span className="ot">&lt;-</span> <span className="fu">r</span>(<span className="at">a =</span> <span className="dv">10</span>)</span>
<span id="cb4-19"><a aria-hidden="true" href="#cb4-19" tabindex="-1"></a></span><br/>
<span id="cb4-20"><a aria-hidden="true" href="#cb4-20" tabindex="-1"></a>cowplot<span className="sc">::</span><span className="fu">plot_grid</span>(r3, r5, r7, r10, <span className="at">nrow =</span> <span className="dv">2</span>)</span></code></pre></div>
</details>
<div className="cell-output-display">
<div className="quarto-figure quarto-figure-center">
  <picture>
    <source type="image/webp" srcset="https://s3.amazonaws.com/databrewer/media/graphics_blog/pseudo_log_cubit_root_transform.webp" />
    <img className="img-fluid figure-img" src="graphics_blog/pseudo_log_cubit_root_transform.png" data-fallback="graphics_blog/pseudo_log_cubit_root_transform.png" />
  </picture>

</div>
</div>
</div>
<p>The problem with root transformation is its opposite effect on values between 0 and 1: it <em>enlarges</em> them rather than shrinking them. This results in an <strong>exaggerated difference between fractional numbers and zero</strong>. Because of this, the minimal fractional population densities in the Saharan area are unrealistically blown up. In addition, the zeros in Egypt are left unchanged, which results in a sharp discontinuity between the transformed fractional numbers and zero. This shows up as an abrupt color transition between Egypt (blacked out) and elsewhere, especially with a large root base (e.g., the 4th panel).</p>
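<p>The effect described above can be checked on a handful of numbers. The sketch below uses made-up sample values (not the population dataset) and applies the same 10th-root operation that <code>r(a = 10)</code> applies to the <code>pop</code> column; the variable name <code>root10</code> is only for illustration.</p>
<div className="cell">
<div className="sourceCode cell-code" id="cb6"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb6-1"><a aria-hidden="true" href="#cb6-1" tabindex="-1"></a><span className="co"># hypothetical sample: zero, tiny fractions, and large values</span></span>
<span id="cb6-2"><a aria-hidden="true" href="#cb6-2" tabindex="-1"></a>x <span className="ot">&lt;-</span> <span className="fu">c</span>(0, 1e-7, 0.01, 1, 100, 1e6)</span>
<span id="cb6-3"><a aria-hidden="true" href="#cb6-3" tabindex="-1"></a></span><br/>
<span id="cb6-4"><a aria-hidden="true" href="#cb6-4" tabindex="-1"></a><span className="co"># 10th root: fractions are enlarged, values above 1 are shrunk, zero stays zero</span></span>
<span id="cb6-5"><a aria-hidden="true" href="#cb6-5" tabindex="-1"></a>root10 <span className="ot">&lt;-</span> x^(1/10)</span>
<span id="cb6-6"><a aria-hidden="true" href="#cb6-6" tabindex="-1"></a><span className="fu">round</span>(root10, 3)</span>
<span id="cb6-7"><a aria-hidden="true" href="#cb6-7" tabindex="-1"></a></span><br/>
<span id="cb6-8"><a aria-hidden="true" href="#cb6-8" tabindex="-1"></a><span className="co"># fraction of the transformed range (0 to max) spanned by the jump from 0 to 1e-7</span></span>
<span id="cb6-9"><a aria-hidden="true" href="#cb6-9" tabindex="-1"></a>root10[2] / <span className="fu">max</span>(root10)</span></code></pre></div>
</div>
<p>Here 1e-7 is lifted to roughly 0.2 while zero stays at 0, so that single step takes up about 5% of the transformed range, although it is a negligible sliver of the original one.</p>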
</section>

<section className="level3" id="reference">
<h3 className="anchored" data-anchor-id="reference">References</h3>
<ol type="1">
<li><Link to="https://win-vector.com/2012/03/01/modeling-trick-the-signed-pseudo-logarithm/"><strong>Modeling trick: The signed pseudo logarithm.</strong> Mount, J. (2012, March 1). <em>Win Vector Blog</em>.</Link></li>
<li><Link to="https://www.cs.cmu.edu/afs/andrew/course/02/250/R/x86_64-redhat-linux-gnu-library/3.5/scales/html/pseudo_log_trans.html">R documentation for <code>pseudo_log_trans()</code>.</Link></li>
<li><Link to="https://scales.r-lib.org/reference/log_trans.html">R documentation for log transformations.</Link></li>
</ol>
</section>
</main>
</div>
</div>
)}