import React from 'react'; 
import {Link} from 'react-router-dom'; 
import {useRCustomEffect} from '../../../useCustomEffect'; 
export default function PivotWiderPart1(){
useRCustomEffect()
return ( <div>
<div className="page-columns page-rows-contents page-layout-article" id="quarto-content">
<main className="content" id="quarto-document-content">
<header className="quarto-title-block default" id="title-block-header">
<div className="quarto-title">
<h1 className="title">Spread Columns into a Wider Dataset (1/3): <em>the basics of</em> <code>pivot_wider()</code></h1>
</div>
<div className="quarto-title-meta">
</div>
</header>
<p><code>pivot_wider()</code> (which supersedes <Link to="https://tidyr.tidyverse.org/reference/spread.html"><code>spread()</code></Link>) is the opposite of <Link to="../pivot-longer-part1"><code>pivot_longer()</code></Link>: it makes a dataset wider by increasing the number of columns and decreasing the number of rows. It’s relatively rare to need <code>pivot_wider()</code> to make tidy data, but it’s often useful for creating summary tables for presentation, or for putting data into a format that other tools require.</p>
<p>Below you’ll learn the basics of <code>pivot_wider()</code> and see how it is applied in three examples.</p>
<p><strong>e.g.1</strong> <code>population</code> is a subset of data from the World Health Organization that records the annual population in countries from 1995 to 2013. The dataset is in a nice tidy structure.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb1"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb1-1"><a aria-hidden="true" href="#cb1-1" tabindex="-1"></a><span className="fu">library</span>(tidyr)</span>
<span id="cb1-2"><a aria-hidden="true" href="#cb1-2" tabindex="-1"></a><span className="fu">library</span>(dplyr)</span>
<span id="cb1-3"><a aria-hidden="true" href="#cb1-3" tabindex="-1"></a></span><br/>
<span id="cb1-4"><a aria-hidden="true" href="#cb1-4" tabindex="-1"></a>population</span></code></pre></div>
<div className="cell-output cell-output-stdout">
<pre className="demo-highlight sourceCode r rcss"><code className="sourceCode r"># A tibble: 4,060 × 3
<br/>   country      year population
<br/>   &lt;chr&gt;       &lt;dbl&gt;      &lt;dbl&gt;
<br/> 1 Afghanistan  1995   17586073
<br/> 2 Afghanistan  1996   18415307
<br/> 3 Afghanistan  1997   19021226
<br/> 4 Afghanistan  1998   19496836
<br/> 5 Afghanistan  1999   19987071
<br/> 6 Afghanistan  2000   20595360
<br/> 7 Afghanistan  2001   21347782
<br/> 8 Afghanistan  2002   22202806
<br/> 9 Afghanistan  2003   23116142
<br/>10 Afghanistan  2004   24018682
<br/># ℹ 4,050 more rows</code></pre>
</div>
</div>
<p>You can use <code>pivot_wider()</code> to reshape the dataset into a wider format, which makes it easier to inspect visually and to analyze further with other tools. The first argument is the dataset to reshape. There are two other basic arguments:</p>
<ul>
<li><p><code>names_from</code> specifies the column whose unique values will become the new column names in the pivoted wide-format data frame.</p></li>
<li><p><code>values_from</code> specifies the column whose values will populate the cells of the resulting wide-format data frame.</p></li>
</ul>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb3"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb3-1"><a aria-hidden="true" href="#cb3-1" tabindex="-1"></a>population <span className="sc">%&gt;%</span> <span className="fu">pivot_wider</span>(</span>
<span id="cb3-2"><a aria-hidden="true" href="#cb3-2" tabindex="-1"></a>  <span className="co"># unique values in the 'year' column are spread out as new column names</span></span>
<span id="cb3-3"><a aria-hidden="true" href="#cb3-3" tabindex="-1"></a>  <span className="at">names_from =</span> year,  </span>
<span id="cb3-4"><a aria-hidden="true" href="#cb3-4" tabindex="-1"></a>  <span className="co"># values in the 'population' column are used to fill up cells of new columns</span></span>
<span id="cb3-5"><a aria-hidden="true" href="#cb3-5" tabindex="-1"></a>  <span className="at">values_from =</span> population</span>
<span id="cb3-6"><a aria-hidden="true" href="#cb3-6" tabindex="-1"></a>)</span></code></pre></div>
<div className="cell-output cell-output-stdout">
<pre className="demo-highlight sourceCode r rcss"><code className="sourceCode r"># A tibble: 219 × 20
<br/>   country               `1995`   `1996`   `1997`   `1998`   `1999`   `2000`   `2001`   `2002`   `2003`   `2004`   `2005`   `2006`   `2007`   `2008`   `2009`   `2010`   `2011`   `2012`   `2013`
<br/>   &lt;chr&gt;                  &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;
<br/> 1 Afghanistan         17586073 18415307 19021226 19496836 19987071 20595360 21347782 22202806 23116142 24018682 24860855 25631282 26349243 27032197 27708187 28397812 29105480 29824536 30551674
<br/> 2 Albania              3357858  3341043  3331317  3325456  3317941  3304948  3286084  3263596  3239385  3216197  3196130  3179573  3166222  3156608  3151185  3150143  3153883  3162083  3173271
<br/> 3 Algeria             29315463 29845208 30345466 30820435 31276295 31719449 32150198 32572977 33003442 33461345 33960903 34507214 35097043 35725377 36383302 37062820 37762962 38481705 39208194
<br/> 4 American Samoa         52874    53926    54942    55899    56768    57522    58176    58729    59117    59262    59117    58652    57919    57053    56245    55636    55274    55128    55165
<br/> 5 Andorra                63854    64274    64090    63799    64084    65399    68000    71639    75643    79060    81223    81877    81292    79969    78659    77907    77865    78360    79218
<br/> 6 Angola              12104952 12451945 12791388 13137542 13510616 13924930 14385283 14886574 15421075 15976715 16544376 17122409 17712824 18314441 18926650 19549124 20180490 20820525 21471618
<br/> 7 Anguilla                9807    10063    10305    10545    10797    11071    11371    11693    12023    12342    12637    12903    13145    13365    13571    13768    13956    14132    14300
<br/> 8 Antigua and Barbuda    68349    70245    72232    74206    76041    77648    78972    80030    80904    81718    82565    83467    84397    85349    86300    87233    88152    89069    89985
<br/> 9 Argentina           34833168 35264070 35690778 36109342 36514558 36903067 37273361 37627545 37970411 38308779 38647854 38988923 39331357 39676083 40023641 40374224 40728738 41086927 41446246
<br/>10 Armenia              3223173  3173425  3137652  3112958  3093820  3076098  3059960  3047002  3036032  3025652  3014917  3002911  2989882  2977488  2968154  2963496  2964120  2969081  2976566
<br/># ℹ 209 more rows</code></pre>
</div>
</div>
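<p>Because <code>pivot_wider()</code> is the opposite of <code>pivot_longer()</code>, the wide table above can be pivoted back into a long layout. The following is a minimal sketch rather than part of the original example: the <code>names_transform</code> and <code>values_drop_na</code> settings are our own choices, made on the assumption that the year names should be converted back to numbers and that cells with no record in the long data should be dropped again.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb17"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb17-1"><a aria-hidden="true" href="#cb17-1" tabindex="-1"></a>population <span className="sc">%&gt;%</span> </span>
<span id="cb17-2"><a aria-hidden="true" href="#cb17-2" tabindex="-1"></a>  <span className="fu">pivot_wider</span>(<span className="at">names_from =</span> year, <span className="at">values_from =</span> population) <span className="sc">%&gt;%</span> </span>
<span id="cb17-3"><a aria-hidden="true" href="#cb17-3" tabindex="-1"></a>  <span className="fu">pivot_longer</span>(</span>
<span id="cb17-4"><a aria-hidden="true" href="#cb17-4" tabindex="-1"></a>    <span className="co"># every column except 'country' holds the values of one year</span></span>
<span id="cb17-5"><a aria-hidden="true" href="#cb17-5" tabindex="-1"></a>    <span className="at">cols =</span> <span className="sc">-</span>country, </span>
<span id="cb17-6"><a aria-hidden="true" href="#cb17-6" tabindex="-1"></a>    <span className="at">names_to =</span> <span className="st">"year"</span>, </span>
<span id="cb17-7"><a aria-hidden="true" href="#cb17-7" tabindex="-1"></a>    <span className="at">values_to =</span> <span className="st">"population"</span>, </span>
<span id="cb17-8"><a aria-hidden="true" href="#cb17-8" tabindex="-1"></a>    <span className="co"># column names are text, so convert the years back to numbers (assumed choice)</span></span>
<span id="cb17-9"><a aria-hidden="true" href="#cb17-9" tabindex="-1"></a>    <span className="at">names_transform =</span> <span className="fu">list</span>(<span className="at">year =</span> as.numeric), </span>
<span id="cb17-10"><a aria-hidden="true" href="#cb17-10" tabindex="-1"></a>    <span className="co"># drop the NA cells created for country-year pairs that had no record (assumed choice)</span></span>
<span id="cb17-11"><a aria-hidden="true" href="#cb17-11" tabindex="-1"></a>    <span className="at">values_drop_na =</span> <span className="cn">TRUE</span></span>
<span id="cb17-12"><a aria-hidden="true" href="#cb17-12" tabindex="-1"></a>  )</span></code></pre></div>
</div>
<p>The result should again have the three columns <code>country</code>, <code>year</code> and <code>population</code>, although the row order is not guaranteed to match the original.</p>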
<p><br/></p>
<p><strong>e.g.2</strong> When a tagged fish swims downstream in a river, each detection by an autonomous monitor (referred to as a <code>station</code>) is encoded as 1. The <code>fish_encounters</code> dataset records this encounter history.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb5"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb5-1"><a aria-hidden="true" href="#cb5-1" tabindex="-1"></a>fish_encounters</span></code></pre></div>
<div className="cell-output cell-output-stdout">
<pre className="demo-highlight sourceCode r rcss"><code className="sourceCode r"># A tibble: 114 × 3
<br/>   fish  station  seen
<br/>   &lt;fct&gt; &lt;fct&gt;   &lt;int&gt;
<br/> 1 4842  Release     1
<br/> 2 4842  I80_1       1
<br/> 3 4842  Lisbon      1
<br/> 4 4842  Rstr        1
<br/> 5 4842  Base_TD     1
<br/> 6 4842  BCE         1
<br/> 7 4842  BCW         1
<br/> 8 4842  BCE2        1
<br/> 9 4842  BCW2        1
<br/>10 4842  MAE         1
<br/># ℹ 104 more rows</code></pre>
</div>
</div>
<p>Below we’ll pivot the dataset into a wider format. This quickly gives a bird’s-eye view of which stations detected each tagged fish.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb7"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb7-1"><a aria-hidden="true" href="#cb7-1" tabindex="-1"></a>fish_encounters <span className="sc">%&gt;%</span> </span>
<span id="cb7-2"><a aria-hidden="true" href="#cb7-2" tabindex="-1"></a>  <span className="fu">pivot_wider</span>(</span>
<span id="cb7-3"><a aria-hidden="true" href="#cb7-3" tabindex="-1"></a>    <span className="co"># unique values in the 'station' column are spread out as new column names</span></span>
<span id="cb7-4"><a aria-hidden="true" href="#cb7-4" tabindex="-1"></a>    <span className="at">names_from =</span> station, </span>
<span id="cb7-5"><a aria-hidden="true" href="#cb7-5" tabindex="-1"></a>    <span className="co"># values in the 'seen' column are used to fill up cells of new columns</span></span>
<span id="cb7-6"><a aria-hidden="true" href="#cb7-6" tabindex="-1"></a>    <span className="at">values_from =</span> seen)</span></code></pre></div>
<div className="cell-output cell-output-stdout">
<pre className="demo-highlight sourceCode r rcss"><code className="sourceCode r"># A tibble: 19 × 12
<br/>   fish  Release I80_1 Lisbon  Rstr Base_TD   BCE   BCW  BCE2  BCW2   MAE   MAW
<br/>   &lt;fct&gt;   &lt;int&gt; &lt;int&gt;  &lt;int&gt; &lt;int&gt;   &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt;
<br/> 1 4842        1     1      1     1       1     1     1     1     1     1     1
<br/> 2 4843        1     1      1     1       1     1     1     1     1     1     1
<br/> 3 4844        1     1      1     1       1     1     1     1     1     1     1
<br/> 4 4845        1     1      1     1       1    NA    NA    NA    NA    NA    NA
<br/> 5 4847        1     1      1    NA      NA    NA    NA    NA    NA    NA    NA
<br/> 6 4848        1     1      1     1      NA    NA    NA    NA    NA    NA    NA
<br/> 7 4849        1     1     NA    NA      NA    NA    NA    NA    NA    NA    NA
<br/> 8 4850        1     1     NA     1       1     1     1    NA    NA    NA    NA
<br/> 9 4851        1     1     NA    NA      NA    NA    NA    NA    NA    NA    NA
<br/>10 4854        1     1     NA    NA      NA    NA    NA    NA    NA    NA    NA
<br/>11 4855        1     1      1     1       1    NA    NA    NA    NA    NA    NA
<br/>12 4857        1     1      1     1       1     1     1     1     1    NA    NA
<br/>13 4858        1     1      1     1       1     1     1     1     1     1     1
<br/>14 4859        1     1      1     1       1    NA    NA    NA    NA    NA    NA
<br/>15 4861        1     1      1     1       1     1     1     1     1     1     1
<br/>16 4862        1     1      1     1       1     1     1     1     1    NA    NA
<br/>17 4863        1     1     NA    NA      NA    NA    NA    NA    NA    NA    NA
<br/>18 4864        1     1     NA    NA      NA    NA    NA    NA    NA    NA    NA
<br/>19 4865        1     1      1    NA      NA    NA    NA    NA    NA    NA    NA</code></pre>
</div>
</div>
<p>A station records a <code>1</code> only when it detects a fish; nothing is recorded when a fish is not detected, which produces the <code>NA</code> values above. Because a missing record here means the fish was not seen, we can ask <code>pivot_wider()</code> to fill in these missing values with zeros using the <code>values_fill</code> argument.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb9"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb9-1"><a aria-hidden="true" href="#cb9-1" tabindex="-1"></a>fish_encounters <span className="sc">%&gt;%</span> <span className="fu">pivot_wider</span>(</span>
<span id="cb9-2"><a aria-hidden="true" href="#cb9-2" tabindex="-1"></a>  <span className="at">names_from =</span> station, </span>
<span id="cb9-3"><a aria-hidden="true" href="#cb9-3" tabindex="-1"></a>  <span className="at">values_from =</span> seen,</span>
<span id="cb9-4"><a aria-hidden="true" href="#cb9-4" tabindex="-1"></a>  <span className="at">values_fill =</span> <span className="dv">0</span> <span className="co"># use 0 in place of NA values</span></span>
<span id="cb9-5"><a aria-hidden="true" href="#cb9-5" tabindex="-1"></a>)</span></code></pre></div>
<div className="cell-output cell-output-stdout">
<pre className="demo-highlight sourceCode r rcss"><code className="sourceCode r"># A tibble: 19 × 12
<br/>   fish  Release I80_1 Lisbon  Rstr Base_TD   BCE   BCW  BCE2  BCW2   MAE   MAW
<br/>   &lt;fct&gt;   &lt;int&gt; &lt;int&gt;  &lt;int&gt; &lt;int&gt;   &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt;
<br/> 1 4842        1     1      1     1       1     1     1     1     1     1     1
<br/> 2 4843        1     1      1     1       1     1     1     1     1     1     1
<br/> 3 4844        1     1      1     1       1     1     1     1     1     1     1
<br/> 4 4845        1     1      1     1       1     0     0     0     0     0     0
<br/> 5 4847        1     1      1     0       0     0     0     0     0     0     0
<br/> 6 4848        1     1      1     1       0     0     0     0     0     0     0
<br/> 7 4849        1     1      0     0       0     0     0     0     0     0     0
<br/> 8 4850        1     1      0     1       1     1     1     0     0     0     0
<br/> 9 4851        1     1      0     0       0     0     0     0     0     0     0
<br/>10 4854        1     1      0     0       0     0     0     0     0     0     0
<br/>11 4855        1     1      1     1       1     0     0     0     0     0     0
<br/>12 4857        1     1      1     1       1     1     1     1     1     0     0
<br/>13 4858        1     1      1     1       1     1     1     1     1     1     1
<br/>14 4859        1     1      1     1       1     0     0     0     0     0     0
<br/>15 4861        1     1      1     1       1     1     1     1     1     1     1
<br/>16 4862        1     1      1     1       1     1     1     1     1     0     0
<br/>17 4863        1     1      0     0       0     0     0     0     0     0     0
<br/>18 4864        1     1      0     0       0     0     0     0     0     0     0
<br/>19 4865        1     1      1     0       0     0     0     0     0     0     0</code></pre>
</div>
</div>
<p><br/></p>
<p><strong>e.g.3</strong> Imagine you have a contact list that you’ve copied and pasted from a website:</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb11"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb11-1"><a aria-hidden="true" href="#cb11-1" tabindex="-1"></a>contacts <span className="ot">&lt;-</span> <span className="fu">tribble</span>(</span>
<span id="cb11-2"><a aria-hidden="true" href="#cb11-2" tabindex="-1"></a>  <span className="sc">~</span>field, <span className="sc">~</span>value,</span>
<span id="cb11-3"><a aria-hidden="true" href="#cb11-3" tabindex="-1"></a>  <span className="st">"name"</span>, <span className="st">"Jiena McLellan"</span>,</span>
<span id="cb11-4"><a aria-hidden="true" href="#cb11-4" tabindex="-1"></a>  <span className="st">"company"</span>, <span className="st">"Toyota"</span>, </span>
<span id="cb11-5"><a aria-hidden="true" href="#cb11-5" tabindex="-1"></a>  <span className="st">"name"</span>, <span className="st">"John Smith"</span>, </span>
<span id="cb11-6"><a aria-hidden="true" href="#cb11-6" tabindex="-1"></a>  <span className="st">"company"</span>, <span className="st">"google"</span>, </span>
<span id="cb11-7"><a aria-hidden="true" href="#cb11-7" tabindex="-1"></a>  <span className="st">"email"</span>, <span className="st">"john@google.com"</span>,</span>
<span id="cb11-8"><a aria-hidden="true" href="#cb11-8" tabindex="-1"></a>  <span className="st">"name"</span>, <span className="st">"Huxley Ratcliffe"</span></span>
<span id="cb11-9"><a aria-hidden="true" href="#cb11-9" tabindex="-1"></a>)</span>
<span id="cb11-10"><a aria-hidden="true" href="#cb11-10" tabindex="-1"></a></span><br/>
<span id="cb11-11"><a aria-hidden="true" href="#cb11-11" tabindex="-1"></a>contacts</span></code></pre></div>
<div className="cell-output cell-output-stdout">
<pre className="demo-highlight sourceCode r rcss"><code className="sourceCode r"># A tibble: 6 × 2
<br/>  field   value           
<br/>  &lt;chr&gt;   &lt;chr&gt;           
<br/>1 name    Jiena McLellan  
<br/>2 company Toyota          
<br/>3 name    John Smith      
<br/>4 company google          
<br/>5 email   john@google.com 
<br/>6 name    Huxley Ratcliffe</code></pre>
</div>
</div>
<p>This is challenging because there’s no variable that identifies which observations belong together. We can fix this by noting that every contact starts with a name, so we can create a unique id by counting every time we see <code>name</code> as the field.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb13"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb13-1"><a aria-hidden="true" href="#cb13-1" tabindex="-1"></a>contacts <span className="ot">&lt;-</span> contacts <span className="sc">%&gt;%</span> </span>
<span id="cb13-2"><a aria-hidden="true" href="#cb13-2" tabindex="-1"></a>  <span className="fu">mutate</span>(<span className="at">person_id =</span> <span className="fu">cumsum</span>(field <span className="sc">==</span> <span className="st">"name"</span>))</span>
<span id="cb13-3"><a aria-hidden="true" href="#cb13-3" tabindex="-1"></a></span><br/>
<span id="cb13-4"><a aria-hidden="true" href="#cb13-4" tabindex="-1"></a>contacts</span></code></pre></div>
<div className="cell-output cell-output-stdout">
<pre className="demo-highlight sourceCode r rcss"><code className="sourceCode r"># A tibble: 6 × 3
<br/>  field   value            person_id
<br/>  &lt;chr&gt;   &lt;chr&gt;                &lt;int&gt;
<br/>1 name    Jiena McLellan           1
<br/>2 company Toyota                   1
<br/>3 name    John Smith               2
<br/>4 company google                   2
<br/>5 email   john@google.com          2
<br/>6 name    Huxley Ratcliffe         3</code></pre>
</div>
</div>
<p>Now that we have a unique identifier for each person, we can pivot <code>field</code> and <code>value</code> into columns.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb15"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb15-1"><a aria-hidden="true" href="#cb15-1" tabindex="-1"></a>contacts <span className="sc">%&gt;%</span> </span>
<span id="cb15-2"><a aria-hidden="true" href="#cb15-2" tabindex="-1"></a>  <span className="fu">pivot_wider</span>(<span className="at">names_from =</span> field, <span className="at">values_from =</span> value)</span></code></pre></div>
<div className="cell-output cell-output-stdout">
<pre className="demo-highlight sourceCode r rcss"><code className="sourceCode r"># A tibble: 3 × 4
<br/>  person_id name             company email          
<br/>      &lt;int&gt; &lt;chr&gt;            &lt;chr&gt;   &lt;chr&gt;          
<br/>1         1 Jiena McLellan   Toyota  &lt;NA&gt;           
<br/>2         2 John Smith       google  john@google.com
<br/>3         3 Huxley Ratcliffe &lt;NA&gt;    &lt;NA&gt;           </code></pre>
</div>
</div>
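<p>The <code>values_fill</code> argument used for <code>fish_encounters</code> is not limited to numbers. As a small sketch, the missing <code>company</code> and <code>email</code> entries can be filled with a character string, since <code>value</code> is a character column. The placeholder text <code>"not provided"</code> is an arbitrary choice of ours and not part of the original example.</p>
<div className="cell" data-layout-align="center">
<div className="sourceCode cell-code" id="cb19"><pre className="sourceCode r code-with-copy"><code className="sourceCode r"><span id="cb19-1"><a aria-hidden="true" href="#cb19-1" tabindex="-1"></a>contacts <span className="sc">%&gt;%</span> <span className="fu">pivot_wider</span>(</span>
<span id="cb19-2"><a aria-hidden="true" href="#cb19-2" tabindex="-1"></a>  <span className="at">names_from =</span> field, </span>
<span id="cb19-3"><a aria-hidden="true" href="#cb19-3" tabindex="-1"></a>  <span className="at">values_from =</span> value,</span>
<span id="cb19-4"><a aria-hidden="true" href="#cb19-4" tabindex="-1"></a>  <span className="co"># hypothetical placeholder; it must be a string because 'value' is a character column</span></span>
<span id="cb19-5"><a aria-hidden="true" href="#cb19-5" tabindex="-1"></a>  <span className="at">values_fill =</span> <span className="st">"not provided"</span></span>
<span id="cb19-6"><a aria-hidden="true" href="#cb19-6" tabindex="-1"></a>)</span></code></pre></div>
</div>
<p>Whether a placeholder is preferable to <code>NA</code> depends on the downstream use: <code>NA</code> keeps the gaps visible to functions that handle missing values, while a string is easier to read in a presentation table.</p>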
<p><br/></p>
<p>You are now familiar with the basic use of <code>pivot_wider()</code>. The following two tutorials on <code>pivot_wider()</code> discuss more advanced features that let you efficiently pivot datasets with increasingly complex structures.</p>
</main>
</div>
</div>
)}