{"id":10579,"date":"2022-03-09T09:00:00","date_gmt":"2022-03-09T14:00:00","guid":{"rendered":"https:\/\/dvsnightingstg.wpenginepowered.com\/?p=10579"},"modified":"2022-03-08T21:23:18","modified_gmt":"2022-03-09T02:23:18","slug":"data-exploration-step-1-getting-to-know-your-data","status":"publish","type":"post","link":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/","title":{"rendered":"Step 1 in the Data Exploration Journey: Getting Oriented"},"content":{"rendered":"\n<p class=\"is-style-lead\">This article is part II in a series on data exploration, and the common struggles that we all face when trying to learn something new. The previous article can be found <a href=\"https:\/\/dvsnightingale.wpenginepowered.com\/embrace-the-challenge-to-beat-imposter-syndrome\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. I\u2019ll be using the tools data from the State of the Industry Survey as a basis for this exploration, to illustrate both how I approach a new project, and the fact that no \u201cexpert\u201d is immune from the challenges and setbacks of learning. In addition to working with a new dataset, I am also using this project to take my first steps toward learning R. Let\u2019s see where this journey takes us!<\/p>\n\n\n\n<p class=\"has-drop-cap\">Before diving into a project where I\u2019m likely to get diverted or distracted, it\u2019s helpful to take a moment to get a clear idea of what I\u2019m working toward. A design brief helps to clarify my thoughts and it gives me a reference point to check against as I evaluate tangents and new opportunities that come up during the exploration phase. It can also be helpful as a framework for evaluating success, and as a way to structure feedback and evaluate input from other people. 
I don\u2019t always write this down, but I find it helpful to have clear objectives from the start.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Design brief:&nbsp;<\/h2>\n\n\n\n<p>Here\u2019s a rough outline of what I\u2019m trying to achieve in this project.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Project Scope:<\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>Design a chart (or group of charts) to showcase a single 2020 survey question about tool usage, and to provide insights into the skills needed in different careers.&nbsp;<\/li><li>For now, I will focus on the tools question exclusively, though there are many interesting questions correlating this information to other survey questions that might be worth considering in the future.<\/li><li>This is an exploratory project, so the final output and mechanism of delivery are to be determined.<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/dvsnightingale.wpenginepowered.com\/an-insight-informed-view-of-dvs-education-opportunities\/\" target=\"_blank\" rel=\"noreferrer noopener\">Context<\/a>:<\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>This visualization is part of a larger project to leverage the DVS survey data to inform people about different career paths.<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/dvsnightingale.wpenginepowered.com\/who-is-your-chart-for\/\" target=\"_blank\" rel=\"noreferrer noopener\">Audience<\/a>:&nbsp;<\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>DVS members and <em>Nightingale<\/em> readers.<\/li><li>People working in dataviz who are curious to see what tools others use.<\/li><li>People new to dataviz who want to understand what they should learn next (especially if they are interested in a particular career, or comparing different careers to find a match for their interests).<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/dvsnightingale.wpenginepowered.com\/dashboard-redesign-understanding-purpose\/\" 
target=\"_blank\" rel=\"noreferrer noopener\">Purpose<\/a>:<\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>To understand which tools are most popular, and which sets of tools tend to be used together.<\/li><li>To find out how much tool usage varies between professional communities.<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/www.datavisualizationsociety.org\/survey\" target=\"_blank\" rel=\"noreferrer noopener\">Data<\/a>:&nbsp;<\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>Data from the 2020 DVS census, hand-tagged to different career groups by job title.<\/li><li>The first four career categories account for roughly \u2154 of the data. Business analysts and related roles are the largest group, at almost 36 percent.<\/li><li>1,766 individual data points, with 33 tools (plus an \u201cother\u201d category) listed in the dataset.<\/li><li>This survey question is not exclusive; respondents can choose more than one answer.<\/li><li>I am not doing anything to remove or track incomplete responses at this point.&nbsp;<\/li><li>The \u201cother\u201d category in the original question is excluded from this analysis, since it is a free-entry field and harder to process (a future version should include this data as well).<\/li><\/ul>\n\n\n\n<p>Once I know what I\u2019m trying to do, the next step is the fun part: getting in there and understanding what this problem really looks like. This is the \u201c<a href=\"https:\/\/dvsnightingale.wpenginepowered.com\/embrace-the-challenge-to-beat-imposter-syndrome\/\" target=\"_blank\" rel=\"noreferrer noopener\">expand<\/a>\u201d stage, where I take some time to get oriented, understand what the data is all about, and play around with some initial ideas. This stage is especially critical when you are working with someone else\u2019s data. You want to find all of the gotchas and limitations lurking in the data before spending too much time on a formal analysis. 
I&#8217;m also looking to assemble a strategic view of where I&#8217;m going, so that I know where to spend my time in the much slower, higher-effort &#8220;focus\/consolidate&#8221; step that comes next.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Practices for exploration<\/h2>\n\n\n\n<p>For me, the most important principles of the exploration piece of the expand phase are the following:&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Do what\u2019s easy<\/h3>\n\n\n\n<p>I\u2019m trying to learn as much as possible about the problem without getting too bogged down in the details or impeded by my tools. Right now, I just want to rummage through as many aspects of the data as I can, to identify which ones might be promising enough to come back to later, and to anticipate the dead ends.<\/p>\n\n\n\n<p>What that looks like for me right now is a tiny bit of analysis in R to export a spreadsheet, and then a whole lot of manual playing around in Excel to figure out how I want the bits to work together. I&#8217;d do that in R if I had the skills, but I don\u2019t (yet\u2026that\u2019s what I\u2019m learning!). For now, I\u2019ll use R when I can, and when I can&#8217;t, it&#8217;s good old elbow grease in Excel to fill in the gaps. I\u2019m leaving the hard work of learning new software for later, when I have a better idea of what I need, and a better sense of the data to help me sense-check my results. The price for that will be hours spent doing things inefficiently in Excel, but the tradeoff is worth it to me right now, especially because it\u2019s easier for me to check my work and debug data issues in software that I know well.<\/p>\n\n\n\n<p>I&#8217;m using Illustrator for the charts for several reasons. I&#8217;m already familiar with it, I want the flexibility to sketch and ideate on top of basic data points and visual forms, and the actual data values aren&#8217;t all that important to me right now. 
The data points will all need to be carefully recalculated and analyzed for the final version anyway, so everything in this file is subject to change, and should be thrown out. Knowing that gives me the freedom to ignore \u201clittle\u201d things like axis labels, and just dump in screenshots and notes to help myself re-connect the dots later, instead of spending days building out charts that I know I will throw away. Is this best practice? No. Is it ok when you\u2019re sketching? In my opinion, yes \u2013 if you know how to structure your notes and your process so that you can pick up the pieces later.<\/p>\n\n\n\n<p>One important caveat: this approach works well in a case like this one, where I am doing multiple straightforward analyses off of the same dataset (lots of pivots, but using only two source data tables, and few calculations). I am not building a big, complicated analysis where each step depends on the result of the last. If you have a sequence of analyses that affect one another, it is almost always better to build out and thoroughly test each one before moving on to the next. You might still do some sketching and ideation to make sure you understand the paths and the steps that you need to take, but you should always be more careful when working with code or analyses that have strong dependencies.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Sketch first<\/h3>\n\n\n\n<p>You may be thinking that this is a sloppy, imprecise way to work\u2026and you would be right! To me, that&#8217;s actually sort of the point here. I don&#8217;t want to get in too deep and start taking myself seriously before I know what I&#8217;m after and where I&#8217;m going. In my experience, an analysis that looks like it might be finished is a lot more dangerous than one that is clearly a mess, because it&#8217;s easy to forget that \u201cone little thing\u201d you needed to do when you came back. 
I used to tell my students that the best way to avoid plagiarizing was to never copy and paste (or even paraphrase) someone else&#8217;s sentence into your document. Once it&#8217;s in there, it&#8217;s really easy to forget that you need to go back and make a change, but a big block of [add something interesting here later] with a link to your references file is something you&#8217;re not likely to miss in the editing phase.<\/p>\n\n\n\n<p>I find that the same thing applies to charts. If I make a chart that looks &#8220;real&#8221; in Excel and I skip a step in the data analysis for the sake of time, I&#8217;m much more likely to end up with an error in my final dataset. I consciously prevent that by increasing the separation between the ideation and editing stage (different tools, different files, etc.). This helps me to avoid getting bogged down in the details too early, short-circuits perfectionism, and gives me the room to move freely while I work through the big-picture strategy for a project. Sometimes, it also means that I make mistakes, but I can usually live with that. This approach <em>only works<\/em> if you have the discipline to really, actually start over from scratch and to resist the urge to copy and paste later. Otherwise, you risk transferring errors and missing gaps that could jeopardize your entire analysis.<\/p>\n\n\n\n<p>Personally, I have usually learned enough by the end of the sketching to more than make up for the time that I\u2019ve lost making messy charts that need to be thrown out. You may find that it\u2019s different for you; in that case, it might be better to stick to pen and paper for sketches, because that\u2019s almost always the fastest way. I prefer the additional detail of realistic data values, but depending on the project, it\u2019s not always necessary, or worth the time. In some cases, my values are so exact that I have to do the full analysis, at high quality, right from the beginning. That\u2019s okay, too. 
It\u2019s just a matter of deciding what makes sense for you, and for the project at hand.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Leave a trail<\/h3>\n\n\n\n<p>This kind of rapid ideation also means that I need to leave myself a trail to make sure that I can come back and re-create each individual step. This practice helps me to make first-draft documentation for the analysis and the project. If I know that I&#8217;m going to have to go back and figure out those cryptic notes later, it creates a good incentive not to cut corners on writing things down. I usually just keep a running Word document with a bullet list of changes for each day, notes about file and tab names and locations, and a bunch of screenshots showing different iterations. Use whatever works for you. Writing blog posts is also a really good way to document what you&#8217;re doing at a high level, to help make sense of the details in your implementation notes.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Start simple<\/h3>\n\n\n\n<p>Sometimes it\u2019s hard to resist the urge to dive right in on the newest, most interesting thing, but I find that it\u2019s much better to slow down and look around me first. Always start simple and work up from there, especially when you\u2019re working with someone else\u2019s data. I am 99 percent sure that I am going to want some kind of advanced cluster analysis by the time I\u2019m done with this project, but it would be a mistake to get impatient and go for that destination right away.&nbsp;<\/p>\n\n\n\n<p>First, I don\u2019t know what I\u2019m doing yet with the R software. I\u2019m much more likely to end up bogged down in questions I can\u2019t answer, frustrated by my technology, and missing out on the actual insight if I jump in right away. Even if I could find a package to run the analysis automatically, I\u2019d have to blindly trust the output at this stage, and that can be dangerous. 
I never trust an automated routine without first taking the time to understand the method and its limitations, becoming familiar with the data I\u2019m putting in, and getting at least some sense of what I should expect to get out. Otherwise, it\u2019s just a black box and I have no way to evaluate the results.&nbsp;<\/p>\n\n\n\n<p>Second, by jumping in directly I\u2019d miss the opportunity to develop a deeper understanding of the dataset that will help to inform my interpretation of the results. Third, there are tons of other insights sitting right in front of me, just waiting to pop out. If I short-circuit the exploration stage in favor of a fancy analysis, I may end up missing the most important thing that this data has to tell me. Like any good warmup routine, a robust exploration phase helps to make sure you\u2019re ready, improves your performance, and prevents injury (mistakes\/frustration) when you get into the actual analysis.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Follow the data<\/h3>\n\n\n\n<p>My whole job in this part of an analysis is to understand what\u2019s going on with the data. I want to get a big-picture sense of counts and distributions, and to see where there is variation between careers. I usually start with the simplest possible question that I can ask of the data, and then work forward from there. I\u2019m not trying to force a particular path or get to a specific outcome yet. Right now, I just want to see what questions come up as I look at the dataset.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Ask more questions<\/h3>\n\n\n\n<p>The wonderful thing about questions is that answering one of them almost always creates another. Follow your curiosity and see where it leads. Looking at one chart will usually suggest another idea. As I get into more depth, I keep a running list of more complex questions that I want to come back to and explore. 
I also structure my output document to reflect the series of questions I asked, so that I can come back later and follow the trail.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">A first look at the data:<\/h2>\n\n\n\n<p>Here are some of the questions I asked in the first couple of weeks of exploring this data:&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How common are the different tools? Which tool is most popular?<\/h3>\n\n\n\n<p>My first chart was just a simple frequency calculation for the different tools, plugged into the most basic, default chart possible, to help me see the data values. Adding a simple sort function lets me see which tools top the list.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh3.googleusercontent.com\/K3J7bYxKCDLTUs__oCE09ux2CrnEYkT7QDfsHJml0pDPsJXQzWomwthovryUBfY6mdcq1Y1DlUrL2AtDwEp1k2bG5PslIIDFOz5zV_V8pletnzOAzoyg0nU7uR7WKkqJzs7fqT_-\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<p>It\u2019s important to note that I\u2019m not looking at specific values right now, because I know my analysis is not robust enough to support that kind of weight. Doing this analysis in Excel required dragging each and every tool into my pivot table by hand; there were 35 of them, and they needed to be added in order, using the same order every time. I decided that I didn\u2019t need that level of detail in this stage of the project. That means that my ranked values are only based on half of the dataset, so I really know nothing about which is the most common tool from this chart. (In fact, this chart is missing Tableau, which is actually the second-most common tool!)&nbsp;<\/p>\n\n\n\n<p>It&#8217;s also important to note that the total number of answers in the bars adds up to significantly more data points than people who took the survey, because this survey question allows one person to select multiple answers. 
I\u2019m not yet doing anything about that multiselect or even tracking its meaning, but that\u2019s worth adding to the list of items that I need to keep in mind as I go.&nbsp;<\/p>\n\n\n\n<p>It\u2019s fine to work with incomplete data right now, as long as I resist the urge to try to draw conclusions or make inferences off of the differences that I see in these charts. I know that I am missing important information in the points that I\u2019m not showing, and that those values could change everything about the conclusions that I\u2019d draw based on the numbers that I see here. Again, I\u2019m just trying to get a sense of the dataset, and I care more about the structure and the kinds of analyses right now than I do about the individual values. Getting too attached to individual values and conclusions can actually be counterproductive at this stage.&nbsp;<\/p>\n\n\n\n<p>This is another reason that I chose not to bother creating and formatting axis labels in my Illustrator document: it is an extra manual step in the software, but it\u2019s also a salient reminder that I can\u2019t count on any of this information to be real. That level of detail comes later, when I\u2019m ready to do this thing right.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How much does the distribution of tools vary across careers?<\/h3>\n\n\n\n<p>Next, I built a copy of my basic chart for each of the different career paths. 
The first column shows the total for all career paths, and the subsequent charts show distributions for each of the subgroups.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh3.googleusercontent.com\/KE5-lLDdUJuTmwXjNetksnXGUvnZPCBAQfkfG0OHqDu0fcpAxfRXUsdckedZB7Q_YssK1wyAOohqsaXZYj10kmAb_k5h4Bl_2ru5BiRQBYIirSU_IB1jEA7YBEms6jVXLYjyKlua\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Which tool is the most popular, within each career group?&nbsp;<\/h3>\n\n\n\n<p>Again, I can sort my bars by height, to get a sense for the popularity of individual tools. The previous charts all used the same sort as the totals chart, so that I could compare positions across career groups (if Excel was at the top of the totals list, it is the top bar within each career chart). If I want to look at popularity within groups, I can re-sort my y axis as I create each individual chart. I have to be careful here, because this means that my y axis is now different for each chart, which makes my omission of data from these visualizations riskier. Again, I\u2019m deliberately not fixating on data values or patterns yet, and knowing that I\u2019m on thin ice with the analysis helps to keep me out of inference-making mode and in the exploration space.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/ggFsWkALUWWtBBR9bLG4GVegMyxbr7XS5qJRkjmDQjgQE1KXAYIVEAe9iAhA0kPcqmyfywUh_HvDVWPnTPmWW5Yufq__AAEStlPFXcbHNaq2aZto1jj7XdFRI0zG23O3Q1jE5gIQ\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">How many counts do we have for the different career-tool groups?<\/h3>\n\n\n\n<p>Another reason to be cautious is that I know from a previous analysis that the size of my career groups is not evenly distributed within the dataset. 
The charts above are scaled automatically to the max for each dataset, which hides that variation. The second row of charts below shows the first row scaled to a common x axis, set based on the sorted and unsorted totals charts at the far left of the row. Right away, I have a different sense of which differences are meaningful to follow up on; this completely shifts my interpretation of the previous charts. I want to find that out now, before I get attached to a story that doesn\u2019t exist.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh4.googleusercontent.com\/TpwUHkxN5yuSSgk-jxbIESRB1SvUTT9q1hUKHlfeTGt0H7dAna9IpTtFPk3oadHVIW3IFFIySOwvENQMVuuXgqNkrPi-vxmciFV15YxHz29XjYNnt5Sd3q57oa2ITKW4cFgC2IEH\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<p>I often use this sort of small multiples approach to help keep me honest when looking for interesting differences, patterns, and trends in aggregated data. If you forget about the underlying counts, you\u2019ll often end up chasing differences that vanish in the final analysis, or drawing conclusions that the data can\u2019t support.&nbsp;<\/p>\n\n\n\n<p>Again, I\u2019m not looking at values here, but what I <em>can<\/em> see is that there is a fairly long tail for most career groups, and that the shape of the distribution is similar across groups. There are a couple of careers with a shoulder, or with a more abrupt increase in counts for the top few bars, but the careers with the largest differences in distribution widths seem to be the ones with the lowest counts. That makes me wonder if the width of the distribution reflects the number of tools chosen by individuals, or whether it\u2019s actually driven by variation between individuals within each group (everybody picks five tools, but no two people pick the same tool). 
I can\u2019t answer that question from here just yet, but I\u2019ll put it aside to dig into later, when I\u2019m ready for more sophisticated comparisons.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How much does the top tool vary across career groups?<\/h3>\n\n\n\n<p>The sorted bar charts can give me some sense of distribution, but they do a terrible job of helping me to track tool position from one career group to the next. Putting the same data into a different visual form makes that task a lot easier.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/NaoXtCyqf7poKTikdGj2fdC2lSEzJXu2BWn-LBtX9Y2CWo430yMQMsdyTsMjK_KjzzhqxVqkfm5QctyNjOjezjpmvC7-q-M8aj69mbBX_BLEBljPfjy9M1X9N2F0N5Ta_v9tKLWk\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<p>The multiple y plot gets crowded fast, and I didn&#8217;t want to draw out all those different connections individually, so I contented myself with drawing lines for the top 10, and will come back and put in the effort to build this out in code later, if it makes the final cut. I added color to the top few lines, just to help me follow them across a busy chart. If I were to build this specific chart out for actual use, I\u2019d want to include animation to support a focus task or to select one or two lines to follow as a single narrative, rather than trying to look at everything at once.<\/p>\n\n\n\n<p>I don\u2019t want to get over enthusiastic with my conclusions here, but there is some interesting variability in the line shape for different tools. Excel stays pretty consistent in the top three slots, where other tools are first for some career groups and not even in the top 10 for others. This might suggest specialized tool sets for particular careers (e.g., designers use Illustrator, developers use d3), and that gets right to the heart of the comparisons that I\u2019m trying to make. 
This is something that seems worth coming back to, when I have all of the data in place.&nbsp;<\/p>\n\n\n\n<p>I also think that it might be interesting to pull information about the relative size of the different tools into this diagram, in addition to their order in the ranking. I&#8217;d like to see how much bigger Excel is than Java, for instance, and if most Java users fall within a particular career group. For now, a quick-and-dirty way to do that is to add a stacked bar chart at the beginning of the diagram, showing the relative proportion of total users for each tool in the chart (in the order that they\u2019re shown in the parallel axis plot). I\u2019d want to do the same for each career, to look at variability in the distribution across groups, but this is enough to remind me to think through that piece when I come back to refine this later.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How many different tools do people use?<\/h3>\n\n\n\n<p>So far, I\u2019ve been looking at counts per tool, but it\u2019s also interesting to explore how many tools are listed per user. This is an interesting question on its own, but it will also help me to get a handle on how much duplicate counting I\u2019m doing in the previous charts. 
One respondent can identify multiple tools in this survey question: if I sum up all of my bars in the counts charts above, I get just north of 4,000 data points, but there are only 1,766 individual responses to the survey, and I know that some of those responses are incomplete.<\/p>\n\n\n\n<p>Fortunately, because of the way the data is structured, getting a count of tools per user is as simple as adding a countA column to the dataset and doing a different pivot off of the same table.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/OkLz2Btn8OwKctS6vQE9NJMftiajCNfTD5jd_TJiToffDFGLZ_Rg-SKyXiXXFFOukCrjU1DnfhYd0eMDN9vUkJtqCRr4NWNtGVqwcN680Fk_FjMiRLxif3uqweSmjPrq0zU0xpjh\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/QC7Mx7lSgtBJ3C9zwpLttxkCDsfnxe9PXq212dT1grId_aoEY4g8hRJJU5OjF2kQ8bih_dSBu_0EGzO2IeRUSAZ2BmHiwp66ZvnjtWfMXy1E2FZElJFAav0yQqvHYTytAk3WCDbX\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<p>As expected, these curves are pretty asymmetrical: lots of people use just a few tools, and then some professions have a long tail of people who use just about everything. Some professions are much more variable than others, and the length of the tail varies a bit as well. In general, most people identified 10 or fewer tools, but there were also a couple of overachievers who ticked off all of the tools that I measured here.&nbsp;<\/p>\n\n\n\n<p>Another interesting thing to keep in mind is that there&#8217;s a strong behavioral component to this data. I&#8217;m sure there were a few people who picked only one or two tools, even if they have used many more in their career, and possibly others who dutifully checked off every single tool they&#8217;ve ever used. 
There&#8217;s probably some aspect of prioritization, frequency of use, and expertise\/familiarity with the different tools that&#8217;s not captured here, and we have no way to tell precisely how much those variations affect our data.&nbsp;<\/p>\n\n\n\n<p>This is the difference between descriptive surveys and authoritative research. In a formal research setting, you\u2019d put in structures and practices to minimize variation due to personal behavior or preference so that you could draw firm conclusions about a specific question. That\u2019s not the intent of this survey, which attempts only a broad-strokes picture of the tools that people use. It would be fascinating to do a follow-up study to dig into the specifics, but here we can only look at the responses that people provide, and interpret those as best we can. For our purposes, it\u2019s important to remember the limitations of this dataset, to consider the potential impacts and implications of those limitations on our analysis, and to be careful not to overstate our results.&nbsp;<\/p>\n\n\n\n<p>There may also be a gap between the tools that people use professionally and what they use in their personal projects. If someone is working on a team, they may not personally use d3.js, but it might be the final form for all of their work &#8211; just implemented by someone else. To really get at those details, we&#8217;d need to add several more questions (and a lot more complexity) to the survey. 
It&#8217;s always good to keep in mind what questions you can and can&#8217;t answer from the data, and where your questions and interpretation start to run up against the limits of the information that you have, and to ask whether that changes the level of effort that you\u2019re willing to put into exploring a particular point.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What\u2019s the most common number of tools per profession?<\/h3>\n\n\n\n<p>Another way to get at a comparison between groups is to do a median calculation. The median bin is shown as a teal dot in each bar chart above, to give me a point of reference for making sense of the distributions. I can also go for a more aggregated view, and count up the median bins per profession to make a derivative histogram showing the median number of tools for each career. For most professions, the median falls between four and five tools, but there are a couple with medians as low as two or as high as seven tools as well. I would want to look closely at those edge cases in the final analysis, just to make sure that I don&#8217;t have a hidden n-value problem giving me unrealistic medians or otherwise skewing the results.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh3.googleusercontent.com\/VEBMTE9shmHm2DoPTy6AFrYq9x0xdNHxoqaUUmOBXWse5zOrHcrKNZMbuaF8YLMCI7chYaIWxtG7bZV6QLfYKnvqJfW3yl8nletVK83nyI7r5lvmx63N0PMxlRYyR2oI6pcYKJNJ\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<p>Do you feel how much easier it is to trust this chart, with its confident axes and labeled values? Don\u2019t let the representation fool you: this chart is still missing at least half of my data, and that makes count comparisons meaningless at this stage. 
The more aggregated your representation becomes, the easier it is to miss those important caveats lurking in the details.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What is the median number of tools per profession?<\/h3>\n\n\n\n<p>Of course, as soon as I make this aggregated chart, I want to see which professions are in the seven tools bin, so I\u2019d probably want to include a breakout of some kind or a supplementary view of the median value per career group if I wanted this chart to become the basis for my final analysis.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh4.googleusercontent.com\/gTDRdNRoHwCiRXBJE7tfG7UKHXhZxfQNseMTNo1PxjlR-ZFJv1ZPhr-NzJiQkhBPfetJDLfIuAAZX_y070JIUrgIjlGMEHEObDhH3vKPS09XLOX92A_wVft-sBHOJ7xVA16QMMl6\" alt=\"\"\/><\/figure><\/div>\n\n\n\n<p>This is another chart whose interpretation is highly sensitive to n values. I\u2019m comparing across career groups, but not paying attention to how many responses were collected for each one.&nbsp; Some of these \u201cresults\u201d are based on 15 people and others are based on 700. Always keep your eye on the n values: I haven\u2019t validated my categories here, and I\u2019m pretty certain that at least some of my career categories will need to be merged and redefined before I include them here. <\/p>\n\n\n\n<p>Tempting as it is to start comparing values, I have no business making any conclusions at this stage about why people in dataviz might use more tools than business, for example. I shouldn\u2019t even start to speculate. People who are inexperienced with data analysis will often try to extrapolate from partial results and start imagining stories based on \u201cthe data\u201d or \u201cthe trends I\u2019m seeing,\u201d but it\u2019s important to completely deny that urge in the discovery phase. 
What you put into a chart determines the quality of insight that you get out of it, and \u201cthe data\u201d is only as good as the analysis you\u2019ve done. Getting over-attached to a blip in the numbers will only blind you to the more interesting information that\u2019s really there, and it may lead you to make major mistakes.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What have we learned?<\/h2>\n\n\n\n<p>So, what exactly <em>have<\/em> I learned from doing this exploration, if I can\u2019t trust my counts or draw any conclusions based on what I\u2019ve seen? I have:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>A set of questions to choose from, based on the final story that I decide to tell. I can pull from these later, when my core narrative begins to take shape.<\/li><li>An idea of what the analysis will look like for each question.<\/li><li>Notes about important things to look into, and warnings about things to avoid.<\/li><li>A preliminary view into some interesting aspects of the data, and some initial observations to verify as I work through a more complete analysis:<ul><li>Some tools are quite popular, and are identified as important by almost half of survey respondents. Others have only a handful of users.&nbsp;<\/li><li>Some tools are popular across all career groups, while others are more specialized, and common to just a few careers.<\/li><li>The distribution of tools across career groups varies somewhat, but usually in frequency\/count rather than presence, suggesting that there might be interesting variability within career groups that could be worth teasing out.<\/li><li>Small n values complicate the analysis for several career groups, and reflect too many fine distinctions in my first attempt at manually tagging the data. 
I should consider excluding or merging certain categories, look for another way to improve the counts (merging in data from previous years, etc.), or consider whether these smaller groups can inform a lower-certainty, more qualitative picture.&nbsp;<\/li><\/ul><\/li><li>A list of things to consider next.<\/li><\/ul>\n\n\n\n<p>That\u2019s a lot of information to get out of basic frequency analysis on individual data columns, but the more interesting questions for this data are going to require a bit more work. I knew that from the beginning, but starting here has helped me to get acquainted with the dataset and gives me options to consider when building my project. If the more complex comparisons don\u2019t work out, it won\u2019t be hard to find a new place to start. This basic sense of the data will also help me to evaluate the results that I get and to catch errors, as I wander deeper into more complicated territory.&nbsp;<\/p>\n\n\n\n<p>Comparing back to my design brief, though, I can see that none of these questions has really gotten to the heart of what I\u2019m trying to accomplish (yet). I want to understand how individual people use different tools, and how that maps onto specific skill sets within the career groups. For that, I\u2019ll need to look deeper into the relationships between columns, and across rows. Stay tuned!&nbsp;<\/p>\n\n\n\n<p><strong>Coming up in Education:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Questions, comments, suggestions? Feel free to reach out to <a href=\"mailto:education@datavisualizationsociety.org\">education@datavisualizationsociety.org<\/a> anytime to share your thoughts.\u00a0<\/li><li>Keep your eye out for an Education\/Early career event to talk about data discovery on Saturday April 2. We\u2019re still working out the details, but will announce more via Slack and the DVS newsletter as we get closer to the event.<\/li><li>Are you a data analyst, a dataviz designer\/artist or a dataviz developer\/engineer? 
This year, the education committee is building out career portraits to help people understand what it\u2019s like to work in these different roles. Please sign up <a href=\"https:\/\/forms.gle\/f3FLFQNB7iDSbpeg6\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a> if you\u2019re interested in supporting our research effort, or otherwise contributing to the project. (Note: we will be asking for additional careers in the coming months, but we\u2019re starting with these three first. If this isn\u2019t you, hold tight!)<\/li><li>Do you have experience in determining statistical significance for survey datasets collected without a control series? The tools visualization is one step in a larger project to map out career portraits using our survey data, and we need to get a sense of how big the variation between groups should be to count as real. We have our initial n values summarized and the basic analysis is done, but we could use some help getting the stats right and assessing feasibility for the more complicated comparisons. Please reach out to education@datavisualizationsociety.org if you know how to help.<\/li><li>Interested in joining the education committee? <a href=\"https:\/\/docs.google.com\/forms\/d\/19Q8IWRYpxKuDo8I0CBqu-qsii528HL4obTyBjGfytqg\/edit?usp=sharing\">Applications are now open<\/a>&#8230;let us know how you&#8217;d like to get involved! 
<\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>This article is part II in a series on data exploration, and the common struggles that we all face when trying to learn something 
new&#8230;<\/p>\n","protected":false},"author":17,"featured_media":10593,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"bgseo_title":"","bgseo_description":"","bgseo_robots_index":"index","bgseo_robots_follow":"follow","cybocfi_hide_featured_image":"","_jetpack_memberships_contains_paid_content":false,"footnotes":"","_links_to":"","_links_to_target":""},"categories":[72,184,63,48],"tags":[173,449,35,43,141,312],"class_list":["post-10579","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-business-intelligence","category-design","category-how-to","category-topics-in-dv","tag-business-intelligence","tag-data-exploration","tag-data-visualization","tag-design","tag-how-to","tag-skill-development"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Step 1 in the Data Exploration Journey: Getting Oriented | Nightingale<\/title>\n<meta name=\"description\" content=\"How to approach a new project, and proof that even the experienced are not immune from the challenges and setbacks of learning.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Step 1 in the Data Exploration Journey: Getting Oriented - Nightingale\" \/>\n<meta property=\"og:description\" content=\"How to approach a new project, and proof that even the experienced are not immune from the challenges and setbacks of learning.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/\" \/>\n<meta property=\"og:site_name\" 
content=\"Nightingale\" \/>\n<meta property=\"article:published_time\" content=\"2022-03-09T14:00:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"1669\" \/>\n\t<meta property=\"og:image:height\" content=\"913\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Erica Gunn\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Erica Gunn\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"26 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/\"},\"author\":{\"name\":\"Erica Gunn\",\"@id\":\"https:\/\/nightingaledvs.com\/#\/schema\/person\/834a8c0f514a02a96e56c389f0c126f0\"},\"headline\":\"Step 1 in the Data Exploration Journey: Getting Oriented\",\"datePublished\":\"2022-03-09T14:00:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/\"},\"wordCount\":5540,\"publisher\":{\"@id\":\"https:\/\/nightingaledvs.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1\",\"keywords\":[\"Business Intelligence\",\"data exploration\",\"Data Visualization\",\"Design\",\"How To\",\"skill 
development\"],\"articleSection\":[\"Business Intelligence\",\"Design\",\"How To\",\"Topics in Dataviz\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/\",\"url\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/\",\"name\":\"Step 1 in the Data Exploration Journey: Getting Oriented - Nightingale\",\"isPartOf\":{\"@id\":\"https:\/\/nightingaledvs.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1\",\"datePublished\":\"2022-03-09T14:00:00+00:00\",\"description\":\"How to approach a new project, and proof that even the experienced are not immune from the challenges and setbacks of 
learning.\",\"breadcrumb\":{\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#primaryimage\",\"url\":\"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1\",\"width\":1669,\"height\":913},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/nightingaledvs.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Step 1 in the Data Exploration Journey: Getting Oriented\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/nightingaledvs.com\/#website\",\"url\":\"https:\/\/nightingaledvs.com\/\",\"name\":\"Nightingale\",\"description\":\"The Journal of the Data Visualization 
Society\",\"publisher\":{\"@id\":\"https:\/\/nightingaledvs.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/nightingaledvs.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/nightingaledvs.com\/#organization\",\"name\":\"Nightingale\",\"url\":\"https:\/\/nightingaledvs.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/nightingaledvs.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2021\/05\/logoDVS-5.png?fit=1988%2C454&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2021\/05\/logoDVS-5.png?fit=1988%2C454&ssl=1\",\"width\":1988,\"height\":454,\"caption\":\"Nightingale\"},\"image\":{\"@id\":\"https:\/\/nightingaledvs.com\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/nightingaledvs.com\/#\/schema\/person\/834a8c0f514a02a96e56c389f0c126f0\",\"name\":\"Erica Gunn\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/nightingaledvs.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/dvsnightingale.wpenginepowered.com\/wp-content\/uploads\/2021\/06\/Erica_Gunn3-e1623373687167.jpg\",\"contentUrl\":\"https:\/\/dvsnightingale.wpenginepowered.com\/wp-content\/uploads\/2021\/06\/Erica_Gunn3-e1623373687167.jpg\",\"caption\":\"Erica Gunn\"},\"description\":\"Erica Gunn is a data visualization designer at one of the largest clinical trial data companies in the world. She creates information ecosystems that help clients to understand their data better and to access it in more intuitive and useful ways. She received her MFA in information design from Northeastern University in 2017. 
In a previous life, Erica was a research scientist and college chemistry professor. You can connect with her on Twitter @EricaGunn.\",\"sameAs\":[\"http:\/\/www.ericagunn.com\",\"https:\/\/www.linkedin.com\/in\/erica-gunn-7259809\/\"],\"url\":\"https:\/\/nightingaledvs.com\/author\/erica-gunn\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Step 1 in the Data Exploration Journey: Getting Oriented | Nightingale","description":"How to approach a new project, and proof that even the experienced are not immune from the challenges and setbacks of learning.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/","og_locale":"en_US","og_type":"article","og_title":"Step 1 in the Data Exploration Journey: Getting Oriented - Nightingale","og_description":"How to approach a new project, and proof that even the experienced are not immune from the challenges and setbacks of learning.","og_url":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/","og_site_name":"Nightingale","article_published_time":"2022-03-09T14:00:00+00:00","og_image":[{"width":1669,"height":913,"url":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1","type":"image\/png"}],"author":"Erica Gunn","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Erica Gunn","Est. 
reading time":"26 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#article","isPartOf":{"@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/"},"author":{"name":"Erica Gunn","@id":"https:\/\/nightingaledvs.com\/#\/schema\/person\/834a8c0f514a02a96e56c389f0c126f0"},"headline":"Step 1 in the Data Exploration Journey: Getting Oriented","datePublished":"2022-03-09T14:00:00+00:00","mainEntityOfPage":{"@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/"},"wordCount":5540,"publisher":{"@id":"https:\/\/nightingaledvs.com\/#organization"},"image":{"@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1","keywords":["Business Intelligence","data exploration","Data Visualization","Design","How To","skill development"],"articleSection":["Business Intelligence","Design","How To","Topics in Dataviz"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/","url":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/","name":"Step 1 in the Data Exploration Journey: Getting Oriented - Nightingale","isPartOf":{"@id":"https:\/\/nightingaledvs.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#primaryimage"},"image":{"@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1","datePublished":"2022-03-09T14:00:00+00:00","description":"How to approach a new project, and proof 
that even the experienced are not immune from the challenges and setbacks of learning.","breadcrumb":{"@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#primaryimage","url":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1","contentUrl":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1","width":1669,"height":913},{"@type":"BreadcrumbList","@id":"https:\/\/nightingaledvs.com\/data-exploration-step-1-getting-to-know-your-data\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/nightingaledvs.com\/"},{"@type":"ListItem","position":2,"name":"Step 1 in the Data Exploration Journey: Getting Oriented"}]},{"@type":"WebSite","@id":"https:\/\/nightingaledvs.com\/#website","url":"https:\/\/nightingaledvs.com\/","name":"Nightingale","description":"The Journal of the Data Visualization 
Society","publisher":{"@id":"https:\/\/nightingaledvs.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/nightingaledvs.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/nightingaledvs.com\/#organization","name":"Nightingale","url":"https:\/\/nightingaledvs.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/nightingaledvs.com\/#\/schema\/logo\/image\/","url":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2021\/05\/logoDVS-5.png?fit=1988%2C454&ssl=1","contentUrl":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2021\/05\/logoDVS-5.png?fit=1988%2C454&ssl=1","width":1988,"height":454,"caption":"Nightingale"},"image":{"@id":"https:\/\/nightingaledvs.com\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/nightingaledvs.com\/#\/schema\/person\/834a8c0f514a02a96e56c389f0c126f0","name":"Erica Gunn","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/nightingaledvs.com\/#\/schema\/person\/image\/","url":"https:\/\/dvsnightingale.wpenginepowered.com\/wp-content\/uploads\/2021\/06\/Erica_Gunn3-e1623373687167.jpg","contentUrl":"https:\/\/dvsnightingale.wpenginepowered.com\/wp-content\/uploads\/2021\/06\/Erica_Gunn3-e1623373687167.jpg","caption":"Erica Gunn"},"description":"Erica Gunn is a data visualization designer at one of the largest clinical trial data companies in the world. She creates information ecosystems that help clients to understand their data better and to access it in more intuitive and useful ways. She received her MFA in information design from Northeastern University in 2017. In a previous life, Erica was a research scientist and college chemistry professor. 
You can connect with her on Twitter @EricaGunn.","sameAs":["http:\/\/www.ericagunn.com","https:\/\/www.linkedin.com\/in\/erica-gunn-7259809\/"],"url":"https:\/\/nightingaledvs.com\/author\/erica-gunn\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/02\/Expand-diagram.png?fit=1669%2C913&ssl=1","jetpack-related-posts":[{"id":14254,"url":"https:\/\/nightingaledvs.com\/step-4-in-the-data-exploration-journey-knowing-when-to-stop\/","url_meta":{"origin":10579,"position":0},"title":"Step 4 in the Data Exploration Journey: Knowing When to Stop","author":"Erica Gunn","date":"December 8, 2022","format":false,"excerpt":"This article is part V in a series on data exploration, and the common struggles that we all face when trying to learn something new. A list of previous entries can be found at the end of the article. I\u2019m exploring the tools data from the State of the Industry\u2026","rel":"","context":"In &quot;Design&quot;","block_context":{"text":"Design","link":"https:\/\/nightingaledvs.com\/.\/topics-in-dv\/design\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/11\/You-are-here.png?fit=1200%2C768&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/11\/You-are-here.png?fit=1200%2C768&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/11\/You-are-here.png?fit=1200%2C768&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/11\/You-are-here.png?fit=1200%2C768&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/11\/You-are-here.png?fit=1200%2C768&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":16780,"url":"https:\/\/nightingaledvs.com\/data-exploration-collaboration\/","url_meta":{"origin":10579,"position":1},"title":"Step 5 in the Data Exploration Journey: The 
Magic of Collaboration","author":"Erica Gunn","date":"April 17, 2023","format":false,"excerpt":"A good collaborator offers companionship, a fresh perspective, and can help balance skillsets. Here's how to make a collaboration successful.","rel":"","context":"In &quot;How To&quot;","block_context":{"text":"How To","link":"https:\/\/nightingaledvs.com\/.\/how-to\/"},"img":{"alt_text":"Image of the double diamond model which shows the steps in tackling design projects. The leftmost point of the first diamond is \"question, interest or idea.\" The designer then travels through the \"expand\/ideate\" phase to reach the top of the first diamond, which is \"max risk of overwhelm.\" Then, the designer travels down from that top point in the \"focus\/consolidate phase,\" which is the phase in which this column about collaboration is most relevant. The next phases (in the second diamond) are in light gray font, suggesting that those steps are yet to come, but they include \"building\/producing\" and \"deliver\/deploying\").","src":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/04\/Part-5.png?fit=1200%2C661&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/04\/Part-5.png?fit=1200%2C661&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/04\/Part-5.png?fit=1200%2C661&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/04\/Part-5.png?fit=1200%2C661&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/04\/Part-5.png?fit=1200%2C661&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":18311,"url":"https:\/\/nightingaledvs.com\/data-exploration-cut-to-realistic-scope\/","url_meta":{"origin":10579,"position":2},"title":"Step 6 in the Data Exploration Journey: Cut to Realistic Scope","author":"Erica Gunn","date":"August 21, 2023","format":false,"excerpt":"A 
project to find insights from DVS's State of the Industry Survey data moves from the analysis stages to preparation for build out.","rel":"","context":"In &quot;Data Literacy&quot;","block_context":{"text":"Data Literacy","link":"https:\/\/nightingaledvs.com\/.\/topics-in-dv\/data-literacy\/"},"img":{"alt_text":"The Double Diamond design model has four stages: Ideate, Focus, Build\/Produce and Delivery. In the double diamond, there's an arrow pointing to the very center of the two diamonds, which is right between Focus and Build. The callout text explains that this is where you plan for what to do and how to tackle (at least some of) it.","src":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/08\/image3-2.png?fit=1200%2C658&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/08\/image3-2.png?fit=1200%2C658&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/08\/image3-2.png?fit=1200%2C658&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/08\/image3-2.png?fit=1200%2C658&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/08\/image3-2.png?fit=1200%2C658&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":12264,"url":"https:\/\/nightingaledvs.com\/step-3-in-the-data-exploration-journey-productive-tangents\/","url_meta":{"origin":10579,"position":3},"title":"Step 3 in the Data Exploration Journey: Productive Tangents","author":"Erica Gunn","date":"August 16, 2022","format":false,"excerpt":"This article is part IV in a series on data exploration, and the common struggles that we all face when trying to learn something new. The previous articles can be found here, here, and here. 
I\u2019m exploring the tools data from the State of the Industry Survey, to illustrate both\u2026","rel":"","context":"In &quot;Business Intelligence&quot;","block_context":{"text":"Business Intelligence","link":"https:\/\/nightingaledvs.com\/.\/topics-in-dv\/business-intelligence\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/08\/2022-07-31-06_41_you-are-here.png?fit=1200%2C759&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/08\/2022-07-31-06_41_you-are-here.png?fit=1200%2C759&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/08\/2022-07-31-06_41_you-are-here.png?fit=1200%2C759&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/08\/2022-07-31-06_41_you-are-here.png?fit=1200%2C759&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2022\/08\/2022-07-31-06_41_you-are-here.png?fit=1200%2C759&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":18346,"url":"https:\/\/nightingaledvs.com\/career-paths-in-data-visualization\/","url_meta":{"origin":10579,"position":4},"title":"Career Paths in Data Visualization","author":"Emily Barone","date":"August 22, 2023","format":false,"excerpt":"Portraits of different career areas, and profiles of people who work in data visualization.","rel":"","context":"In &quot;Career&quot;","block_context":{"text":"Career","link":"https:\/\/nightingaledvs.com\/.\/community\/career\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/08\/image1-copy.png?fit=1000%2C1001&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/08\/image1-copy.png?fit=1000%2C1001&ssl=1&resize=350%2C200 1x, 
https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/08\/image1-copy.png?fit=1000%2C1001&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2023\/08\/image1-copy.png?fit=1000%2C1001&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":21860,"url":"https:\/\/nightingaledvs.com\/beyond-storytelling-with-data-guidelines\/","url_meta":{"origin":10579,"position":5},"title":"Beyond Storytelling With Data: Guidelines for Designing Exploratory Visualizations","author":"Jennifer Frazier","date":"September 11, 2024","format":false,"excerpt":"Visualizations are essential for telling the stories of science. Through visualized data we can be transported to far off nebulas or trace our genetic connections to all life. We can see larger patterns in our environment, watching our seas warm, viral cases rise and fall, and the paths of fires\u2026","rel":"","context":"In &quot;Design&quot;","block_context":{"text":"Design","link":"https:\/\/nightingaledvs.com\/.\/topics-in-dv\/design\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2024\/08\/Beyond-Cover.png?fit=1200%2C675&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2024\/08\/Beyond-Cover.png?fit=1200%2C675&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2024\/08\/Beyond-Cover.png?fit=1200%2C675&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2024\/08\/Beyond-Cover.png?fit=1200%2C675&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/nightingaledvs.com\/wp-content\/uploads\/2024\/08\/Beyond-Cover.png?fit=1200%2C675&ssl=1&resize=1050%2C600 
3x"},"classes":[]}],"jetpack_shortlink":"https:\/\/wp.me\/pd2dsI-2KD","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/posts\/10579","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/comments?post=10579"}],"version-history":[{"count":0,"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/posts\/10579\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/media\/10593"}],"wp:attachment":[{"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/media?parent=10579"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/categories?post=10579"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nightingaledvs.com\/wp-json\/wp\/v2\/tags?post=10579"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}