Understanding and Using the Commonwealth of Pennsylvania's Open Data Portal


About the Pennsylvania Open Data Portal

The Commonwealth of Pennsylvania Open Data Portal was launched in August 2016 with an aim to make government data more accessible to the public to support government transparency and openness, spur social and economic benefits of government data, and empower citizens and businesses to innovate and create with government data. This data can be viewed, analyzed, visualized and exported all on one platform. This is the state's repository for publicly accessible open data owned by state agencies. 
This portal also links to Federal, County and City open data portals and to other Geographic Information Systems (GIS) specific Open Data Portals run by State Agencies or other business entities, and it includes links to data from Higher Education and Commissions.
With so many datasets and so many ways to interact with the data, it may be difficult to know where to start. This page answers common questions about what data assets are available, how to search for information, and how to use the data. Don't forget to check out the additional resources at the bottom of the page.

What is Open Data?

Open data refers to data and publishing practices such as the following:
  • data that can be freely used, re-used, and redistributed by anyone, in a user-friendly format
  • data that is readable by a computer so that it can be filtered, sorted, manipulated and downloaded for use
  • proactive release of publishable state data
  • release of high-quality data with metadata and documentation to promote understanding
  • continuous publication and updating of state data to foster discovery and trust through fresh data
As a public resource, open data offers economic, performance, and social value:
  • Offers private business a way to better understand potential markets and integrate government data into new, more innovative products and services
  • Increases government transparency and accountability to its citizens
  • Improves the quality and efficiency of government services by allowing citizens and policy makers to make more informed data-driven decisions
It is important to note that open data does not include Protected Health Information (PHI) or Personally Identifiable Information (PII) that is protected by law, such as a social security number, driver's license number, or medical information.

Useful How To's

We've pulled together some screenshots and videos on this page to help you search for data and use some of the portal's built-in functionality. Some of the information we'll cover includes:
  • Navigating the homepage
  • Searching the open data catalogue
  • How to interact with the data

Navigating the Homepage

Below is an image of the Commonwealth of Pennsylvania Open Data Portal's homepage. We're going to break down the homepage section by section.
Picture of the Welcome page of the Open Data Portal
Picture of the Home page of the Open Data Portal
Picture of the Links in the Footer of the Open Data Portal Home page

Section 1: Menu Links

This section contains several useful links: 
  • Home brings you back to the homepage Data.pa.gov
  • Catalogue takes you to the data catalogue where you can browse through the data
  • User's Guide links to Socrata's videos - we've included some of these videos below for direct access from this page
  • Developers opens a page created by Socrata with resources, support materials, and tips and tricks for developers who want to programmatically access and use open data
  • Survey takes you to our Open Data Survey, where we hope you will provide responses that help us improve the Commonwealth of Pennsylvania Open Data Portal

Section 2: Search Box

You can use this box to enter search terms and quickly see a list of matching data assets. For more targeted searches, you can always begin with the Catalog Links (Section 3).

Section 3: Catalog Links

These tiles take you directly to the catalogue or to specifically sorted or filtered views. The top row includes a link to the full catalogue and three sorted views of the data: most accessed, newly released, and recently updated. The second row of tiles contains category groupings. These filter the catalogue and return only the data related to a specific category. The four categories are Social Services, Business and Economy, Government Administration, and Nature and Environment. Each category can be broken down further on the main catalogue page.

Find what you need with our Data Catalog

The Data Catalog is the home base for navigating the PA Open Data Portal. This page can be accessed from our homepage by clicking the "View Catalog" block in the middle of the page or by selecting Data Catalog from the menu in the top left corner. The Data Catalog is simply a list of all the content on the PA Open Data Portal, including datasets, visualizations, stories, etc.
Search by any of the following:
  • categories, types, or tags on the left side of the screen
  • State Agencies, Commissions, Federal or Local Business Owners
  • keywords entered in the search bar at the top of the page

How to Search and Browse the Catalogue

This catalogue is the gateway to all of the data and visualizations, such as charts and maps, that can be found on the Commonwealth of Pennsylvania Open Data Portal.
Picture of how to use the Search Box in the Data Catalog page

Section 1: Search Box

At the top of the main catalogue is a search box. Any terms you enter here will be searched for matches in the title, description, category, and tags. Socrata uses stemming in its search, which means searches use the root stem of each word. So if you search for the word "funding", your results will also include data with the words "funds", "funded", and "fund". This broadens your search results to help you find the data you are looking for.
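To make the idea of stemming more concrete, here is a minimal Python sketch. It is a toy suffix-stripper written for illustration only and is not Socrata's actual search algorithm; the function names `toy_stem` and `matches` and the short suffix list are assumptions.

```python
# Toy illustration of stemming search; NOT the portal's real algorithm.
SUFFIXES = ("ing", "ed", "s")  # assumed, deliberately tiny suffix list


def toy_stem(word: str) -> str:
    """Lower-case a word and strip one common suffix, if present."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query: str, text: str) -> bool:
    """Return True if any word in text shares a stem with the query."""
    stem = toy_stem(query)
    return any(toy_stem(w) == stem for w in text.split())


# "funding", "funds", "funded", and "fund" all reduce to the same stem.
print({w: toy_stem(w) for w in ("funding", "funds", "funded", "fund")})
```

A real search engine uses a far more careful stemmer, but the principle is the same: the query and the catalogue text are both reduced to stems before they are compared.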

Section 2: Search Results

The results of your search are displayed here. You can click on each data listing to view and, if desired, download the data. The default display for the search results is by most relevant. The box in the upper right-hand area of this section allows you to change the sorting of the results to alphabetical, by most accessed, by recently added, or by recently updated.

Section 3: Filter Options

There are four different filter options you can select to refine your search and create more targeted results. You can apply multiple options as needed but only one for each type of filter.
1. Categories
Data is grouped according to one of the four categories described earlier: Social Services, Business and Economy, Government Administration, and Nature and Environment. These can be broken down even further in the catalogue by clicking the down arrow next to a category and selecting one of the subcategories.
2. Types
The open data portal contains a variety of data types such as datasets, maps, charts, and filtered views. If you are looking for a particular type of asset, like a map, you can apply a filter to search for only those types of assets.
3. Departments
All departments or agencies with data currently released on the portal can be found here. Simply click the "Show All..." button to expand the view and select the department from the list.
4. Tags
Tags, or keywords, are applied to data when it is uploaded to the portal and help to further categorize the listing for easier search and retrieval. Although similar to categories, they are more specific to the actual data and its context.
Now that you are more familiar with the Open Data Portal and how to search for data, here are a few videos that demonstrate how to interact with data and explore some additional features of the Open Data Portal.
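The filter rule described above (several filter types can be combined, but only one value per type) can be sketched in Python as follows. The catalogue entries, field names, and the `filter_catalog` function are hypothetical and exist only to illustrate the behaviour; they do not reflect how the portal stores or queries its catalogue.

```python
# Hypothetical catalogue entries; titles and field values are made up.
CATALOG = [
    {"title": "Example Asset A", "category": "Social Services",
     "type": "dataset", "department": "Agency X", "tags": ["health"]},
    {"title": "Example Asset B", "category": "Nature and Environment",
     "type": "map", "department": "Agency Y", "tags": ["water", "parks"]},
    {"title": "Example Asset C", "category": "Nature and Environment",
     "type": "dataset", "department": "Agency Y", "tags": ["water"]},
]


def filter_catalog(assets, category=None, asset_type=None,
                   department=None, tag=None):
    """Apply at most one value per filter type; None means 'not applied'."""
    results = []
    for asset in assets:
        if category is not None and asset["category"] != category:
            continue
        if asset_type is not None and asset["type"] != asset_type:
            continue
        if department is not None and asset["department"] != department:
            continue
        if tag is not None and tag not in asset["tags"]:
            continue
        results.append(asset)
    return results


# Combine two filter types: one category and one asset type.
nature_maps = filter_catalog(
    CATALOG, category="Nature and Environment", asset_type="map")
print([a["title"] for a in nature_maps])
```

Each keyword argument takes a single value, which mirrors the one-value-per-filter-type rule, while passing several arguments at once narrows the results further.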

View stories, dashboards, and other asset types

Data stories (or dashboards) are a simple way to contextualize and highlight data on the Open Data Portal. They make it easy to interact with data and understand context and trends. Data stories are a way we can explain or analyze an issue through a combination of text and dynamic visuals. One example data story on the PA Open Data Portal is the PA Employment First Story page.

Preview and interact with the data

Once you click a dataset within the catalog, you will be taken to the dataset landing page or a 'primer' page, which provides a description of the dataset and other metadata. Metadata refers to information or descriptions about the data. It describes the data and provides an understanding of what the data is and how to use it.
From the Primer page, you can export the data, access the API endpoint, or get started on creating your own data visualization. 
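For developers, the API endpoint can be called from a script. The Python sketch below only builds a request URL and does not contact the portal. It assumes the usual Socrata (SODA) URL pattern of `/resource/<dataset id>.<format>` with `$limit` and `$offset` paging parameters; the dataset identifier `abcd-1234` is a made-up placeholder, and `build_resource_url` is a hypothetical helper. Always copy the real endpoint from the dataset's own page.

```python
from urllib.parse import urlencode

PORTAL_DOMAIN = "data.pa.gov"      # portal homepage named in this guide
EXAMPLE_DATASET_ID = "abcd-1234"   # hypothetical placeholder, not a real dataset


def build_resource_url(dataset_id, limit=100, offset=0, fmt="json"):
    """Build a SODA-style URL that requests one page of rows."""
    # safe="$" keeps the "$" in parameter names readable in the URL.
    query = urlencode({"$limit": limit, "$offset": offset}, safe="$")
    return f"https://{PORTAL_DOMAIN}/resource/{dataset_id}.{fmt}?{query}"


print(build_resource_url(EXAMPLE_DATASET_ID))
```

Increasing `offset` by `limit` on each call steps through a large dataset one page at a time, and changing `fmt` to `csv` requests the same rows as comma-separated text.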
On a Primer page, you’ll find several areas highlighting aspects of the dataset including this information: 
  • At the top of the page you’ll find key metadata including the dataset title, description and the data provider. 
  • About this Dataset section - This section gives further insight into the creation and upkeep of the dataset, with fields such as the following:
    • Updated - The most recent of the Date Last Updated, Metadata Last Updated, and Date Created values.
    • Date Last Updated - The most recent date on which the data was updated.
    • Metadata Last Updated - The date of the most recent update to any information about the dataset.
    • Date Created - The date on which the dataset was published for the first time.
    • Downloads & Views - The number of times a dataset is visited or downloaded.
    • Update Frequency, Business Owner, Category and Tags round out the information. 
  • What’s in this dataset? - Get a quick snapshot of the total rows, total columns and what each row of data represents.
  • Columns in this dataset - Provides a Data Dictionary for each column in the dataset. 
  • Table Preview - It is possible to view all the rows of the dataset right from the landing page moving through about 14 rows at a time. If you want to filter the dataset, see more rows at once or create a visualization, proceed to the table view using the View Data button at the top of this section or at the top of the page.
  • Related Content Using this Data - Links to all visualizations or stories that were created using this dataset.
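The way the Updated field is described above (the most recent of three dates) can be sketched in a few lines of Python. The function name `updated_date` and the example dates are assumptions made for illustration; they are not taken from any real dataset on the portal.

```python
from datetime import date


def updated_date(date_last_updated, metadata_last_updated, date_created):
    """Return the most recent of the three dates, ignoring missing ones."""
    candidates = [d for d in (date_last_updated, metadata_last_updated,
                              date_created) if d is not None]
    # default=None covers the case where no date is available at all.
    return max(candidates, default=None)


# Made-up example: the metadata was edited after the last data refresh.
print(updated_date(date(2023, 5, 1), date(2024, 1, 15), date(2016, 8, 22)))
```

In this example the metadata change is the latest event, so it is the date that would be shown as Updated even though the data itself is older.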

Navigating a Dataset

Once you have found an interesting dataset, chart, or map to look at, what can you do on the page?
For reference, "View" means any filter, chart, map, etc. that is "Saved" from a dataset.
1. Title: This is the title given to the dataset or view. Views, such as charts, can have a different name from the dataset they are based on.
2. Based On: If you are on a view, this will link to the dataset that the view is based on. For example, if you create a pie chart this would link back to the dataset.
3. Description: This is the description given to the dataset or view.
4. Dataset View: These buttons in the top right allow you to switch between the different views of a dataset or view. You can display more than one view at a time, splitting the screen vertically.
  • View as a table: This view displays each column and row in a tabular format (seen in example above)
  • View as a rich list: This view displays the details for each row grouped together, for example columns may be in a vertical list.
  • View as a single row: This view displays all the details for one row at a time. There are arrows to scroll through the next and previous rows.
5. Find in this Dataset (Grid View) or Search Box (Explore View): Enter a word or words to search within the dataset.
Thank you for visiting the Commonwealth of Pennsylvania Open Data Portal.
Please click here to tell us how you use open data, about your visit today, and about your use of open data portals in general. This information will help us improve the portal to better meet visitors' needs.
Additional Information: