Logo: TechTrax...brought to you by MouseTrax Computing Solutions

Access Data "Clean-up" Tips

by Beth Melton, MVP, MOS Master Instructor
Skill rating level 7.

I know I promised my next article would be on Table design properties; however due to the requests I received on how to normalize data in an existing Access database, this article covers some of the methods I implement when faced with this situation.

If you recall from my previous article, Normalizing Access Data, we started with an Employee table consisting of the following fields: EmpID, Name, Manager, Department, Sector, and Spouse/Children:



Figure 1—Employee Table not yet in 1NF

After Normalization the table was broken into smaller fields, repeating groups were removed, and additional tables were created for data that applied to more than one record. The following four tables and fields were the end result:

Table

Fields

Employees

Emp ID, First Name, Last Name, Dept Code

Dependants

Emp ID, Dependant, Relationship*

Departments

Dept Code, Department, Sector, Manager ID

Managers

Manager ID, Manager

* The Relationship field was added to the Dependants table for this article.



Figure 2—Completed Database Design

Data Clean-up Techniques
Now how do we implement this design in Access? Some may think all of this looks good on paper but taking existing data and converting it into normalized tables would result in too much work.

This is not the case if you know how to utilize action queries in Access and other utilities found in Excel. Between both applications I have successfully normalized databases consisting of several thousand records without reentering any of the data.

The rest of this article will walk through how our initial table can be converted into the completed database design in a few simple steps. You can download the zipped NormalizationTips.mdb (NormalizationTips.zip [8k]) if you would like to work through this tutorial.

How to Break Each Field Down to the Smallest Meaningful Value
I have found the easiest way to create multiple fields from one field is to manipulate the data in Excel. You can use the Text to Columns Wizard to quickly accomplish this task.

  1. Select the Employees table in Database window.
  2. Click Tools in the menu bar, point at Office Links and select Analyze it with MS Excel to create a new Excel workbook:



    Figure 3—Example worksheet in Excel

  3. Next, separate the Name field into a FirstName field and a LastName field:

         A) Right-click column heading C and select Insert.
         B) Select column B then click Data in the menu bar and select Text to Columns…
         C) Select the option for Delimited and click Next.
         D) In the Delimiters section tick Space and then click Finish.
         E) When prompted, answer OK to replace the contents of the destination cells.
         F) Change the text in cell B1 to reflect FirstName and the text in cell C1 should read LastName.

  4. Split the Spouse/Children names using the same steps except use a Comma as the delimiter and supply appropriate field names such as Child1, Child2, and Child3.

    Additional Note:
    If your data does not have a common delimiter, or if you have a list of names in which some records contain a middle initial or suffix while other records do not, take a look at I have a "Name" column which I want to split into "FirstName", "LastName"—how can I do it? for additional methods.

Import Excel Data into Access
Now that the Name and Spouse/Dependant fields have been divided into individual fields you need to import the worksheet back into Access.

  1. Create a new database and import the Employees worksheet:

         A) Click File in the menu bar, select New Database, use a Blank Database and save the file in a location of your choice.
         B) Click File in the menu bar, point at Get External Data and click Import.
         C) Change the Files of Type drop down to reflect Microsoft Excel.
         D) Locate the Employees.xls workbook you created previously.
         E) Step through the Import Wizard for the Employees worksheet using the following guidelines:
             * Turn on First Row Contains Column Headings
             * Store data in a new table
             * Do not set a Primary Key

    Additional Note:
    An alternative method for importing the worksheet data is to select A1:J4 on the Employees worksheet and copy/paste it on the Tables object list in the database window.

How to Remove Duplicate Records
The Departments table and Managers table need to be created however two employee records contain duplicate department and manager data. We can create both tables and remove the duplicate records by using an action query called Make Table Query.

  1. Create a new Make Table query based on the Employees table:

         A) Select the Employees table in the database window.
         B) Click Insert in the menu bar and select Query.
         C) Verify Design View is selected and click OK.
         D) Double-click Department and Sector to add the fields to the QBE grid.
         E) Click Query in the menu bar and select Make Table Query…

  2. Modify the query properties to select unique data values:

         A) Right-click in an empty area in the upper half of the Query window and select Properties.
         B) Locate the property for Unique Values, change the setting to Yes, and close the Properties.

  3. Run the query to create the Departments table:

         A) Click Query in the menu bar and select Run
         B) Confirm the Make Table Query action query.

  4. Modify the Make Table query to create the Managers table and replace the Sector field with the Manager field:

         A) Click Query in the menu bar and select Make Table Query…
         B) Type Managers in the Table Name combo box.
         C) In the Field row (first row) change the Sector field to Manager
             (use the drop down next to the Sector field name in the QBE grid).
         D) Run the query and confirm the action query.
         E) Close the query without saving changes.

    Recap and Additional Notes:

    • Create a Make Table Query.
    • Add the table(s) that contains fields you want to use in the new table.
    • Add only the fields that will be included in the new table to the QBE Grid.
    • Use the Unique Values query property to remove duplicate data values.
    • Run the query to create the new table.

Remove Repeating Groups of Data in Access
The Employees table contains four fields, Spouse, Child1, Child2, and Child3, which are considered repeating groups and they do not belong in the Employees table. A new table should be created and the four fields need to be combined into one field.

Additionally, the Relationship field will be added and populated.

This objective can be easily accomplished using a Make Table Query and an Append Query.

  1. Create a Make Table Query and add a Relationship field:

         A) Select the Employees table.
         B) Click Insert on the menu bar and select Query.
         C) Make sure Design View is selected and click OK.
         D) Double-click the EmpID and Spouse fields to add them to the QBE grid.
         E) In the criteria cell below Spouse type: Is not Null
         F) Click in the third field cell and type: Relationship: "Spouse"
         G) Click Query in the menu bar and select Make Table Query…
         H) Type Dependants in the Table Name combo box and click OK.
         I) Run the Make Table Query.

  2. Change the query type to an Append Query to create new records for the repeating groups:

         A) Click Query in the menu bar and select Append Query…
         B) Select Dependants from the list of table names and click OK.
         C) Change the Spouse field to Child1.
         D) Leave the Is Not Null criteria below the Child1 field.
         E) In the Relationship field change "Spouse" to "Child".



    Figure 4—Append Query

         F) Run the Append Query.
         G) Change Child1 to Child2 and run the query.
         H) Do the same for Child3.
         I) Close the Query without saving changes.

  3. Delete the Spouse and Child fields from the Employees table:

         A) Select the Employees table in the database window and click Design
         B) Select the Spouse, Child1, Child2, and Child3 fields and press Delete on the keyboard.

    Recap and Additional Notes:
    • In your query add only the fields that will be included in the new table to the QBE Grid.
    • Add the related field and the first field in the repeating group.
    • Use Is Not Null criteria (for repeating field) to prevent empty data values from being added to the new table.
    • Run the query to create the new table.
    • Change the Query Type to an Append Query.
    • Change field for the first repeating group to the second field in the repeating group.
    • Run the Append Query.

How to Add a Related Field to Existing Tables
Both the Departments table and Managers table could use a better field to establish relationships between other tables. For this example we will use an AutoNumber data type to make the data entry side easier.

This objective is accomplished by adding a new field to both tables and using an Update Query to populate the data values in the related field.

  1. Modify the Department table and create a new field called DeptCode:

         A) Select the Department table in the database window and click Design.
         B) Click Insert in the menu bar and select Rows.
         C) Name the new field DeptCode.
         D) Change the Data Type to AutoNumber.
         E) Save and close the table.

  2. Create a related field for DeptCode in the Employees table:

         A) Open the Employees table in Design View and insert a new field.
         B) Name the new field DeptCode.
         C) Change the Data Type to Number and verify the Field Size is set to Long Integer.
         D) Save and close the table.

  3. Create a new query based on both the Department table and Employee table:

         A) Select the Queries object list and click New.
         B) Select Design View if necessary and click OK.
         C) Add both the Department table and Employee table.
         D) Close the Show Table dialog box.

    Note: If you see a join line connecting DeptCode in both tables right-click the join line and select Delete. If two fields are named the same Access may automatically relate them in your query. This is called an AutoJoin.

         E) Drag the Department field from the Departments field list and drop
             it on the Department field in the Employees field list to create a join between them.
         F) Double-click DeptCode from the Employees table to add the field to the QBE grid.
         G) Click Query in the menu bar and select Update Query.
         H) Below the DeptCode field from the Employees table locate the Update to cell and type: [Departments].[DeptCode]
         I) Run the Update Query.
         J) Close the query without saving changes.

  4. Delete the Department and Sector fields from the Employees table since they are no longer necessary. DeptCode will be use to relate the Departments table to the Employees table.

  5. Perform Step 3 for the Mangers and Departments tables but use ManagerID as the field name and the Department field can be used to relate the tables in the query.

    Recap and Additional Notes:
    • Use AutoNumber data type if you do not have a predetermined ID.
    • If using a predetermined ID add the data values to each record in the table after adding the field in the primary table.
    • Modify the related table and add a new field to hold related values.
    • If primary field is an AutoNumber data type the related field must be a Number data type and field size must be Long Integer.
    • Create a new Query and add both the Primary table and Related table.
    • Join the tables by a common field(s).
    • Add the Related field to the QBE grid.
    • Modify the query to create an Update Query.
    • In the Update to cell add the table name and field name from the Primary table. For example: [Departments].[DepartmentCode].
    • Run the Update Query to populate new field.
    • Open Related table and verify new data values.

Additional Table Design Modifications
After making the following changes the database design will be complete.

  1. Delete the Department field from the Managers table.

  2. Delete the Managers field from the Employees table.

  3. Rename the Spouse field to Dependants in the Dependants table.

Final Touches
If you want to verify the data in the tables we created you can use two select queries. One query will view the employees and their dependants and the second query will view the employees, department information, and manager.

  1. Employees and Dependants Query:

         A) Create a new query and add the following tables: Employees, Dependants
         B) Add the following fields: EmpId, FirstName, LastName, Dependant, and Relationship.
         C) Relate the tables using the EmpId field if necessary.
         D) Run the query to view the Result set:



    Figure 5—Employees and Dependants Result set

  2. Employees and Departments Query:

         A) Create a new query and add the following tables: Employees, Departments, Managers
         B) Relate the tables using their common field if necessary
         C) Add the following fields: EmpId, FirstName, LastName, Department, Sector, and Manager
         D) Run the query to view the Result set:



    Figure 6—Employees And Departments Result set

Summary
The key to normalizing your Access data is to use the tools supplied in the application. As you may have found, action queries can be quite powerful if implemented correctly. Granted the Table Analyzer Wizard found under the Tools menu was designed to help you normalize data but it has its limitations. For example it can not break fields down to their smallest meaningful value nor can it populate new fields as you create new tables. Not to mention your tables are the most important object in Access. You should know everything that goes in to the table design and not rely on Wizards to do the work for you. I'm not saying Wizards aren't great but not where tables are concerned.

For those following this series my next article will be on Naming Conventions, Data Types and Table Properties, and more on Relationships (Access relationships of course!).

 

 

Go up to the top of this page.
This site powered by the Logical Web Publisher (TM): Fast, easy, and affordable content management