
Pharmaceutical Packaging Validation

Validating the safety and effectiveness of packaging for a new weight management product

STUDY TYPE: HF Validation Study
MY ROLE: Project lead
TIMELINE: August – September 2022
PARTICIPANTS: N = 90

Project Background

As a consultant at Bold Insight, I led a human factors (HF) validation study for our client, a pharmaceutical manufacturer, to assess the safety and effectiveness of labeling for a combination product prior to a 510(k) submission. As project lead, I managed four researchers, who worked in teams of two to conduct fieldwork in Raleigh, NC, and Tampa, FL, over a period of three weeks.

At the time of the study, the manufacturer had FDA approval to market the medication for the treatment of Type 2 diabetes under the brand name "Brand A." The manufacturer was preparing a 510(k) submission to expand the intended use of the medication to include weight management under a new brand name, "Brand B," the labeling for which was the subject of the validation study.

The labeling for the product included the product carton and, for patients and caregivers, an autoinjector pen. In addition to the human factors component of the study, the marketing team was also interested in assessing the aesthetic differences between Design 1 and Design 2, shown below.

The two design options for the product, Design 1 and Design 2, applied to labeling for both the pen and the carton.

Device Overview

  • Intended use: Weight loss and maintenance.

  • Intended users: Patients with BMI ≥ 30 or BMI ≥ 27 and Type 2 diabetes; caregivers of the intended patient group; pharmacists; and pharmacy technicians.

  • Intended use environment: Home environments (patients and caregivers); pharmacies (pharmacists and pharmacy technicians).


Research Methods

Objectives

  1. Assess the ability of the product’s intended users (patients, caregivers, pharmacists, and pharmacy technicians) to successfully select their target brand and strength of medication in a variety of selection scenarios.

  2. Determine if there is any difference in task performance between the two potential packaging designs and collect participants’ subjective feedback about both packaging designs.


Creating Use Scenarios

Since the study needed to evaluate whether the intended users could select the correct medication strength for their intended use and in their intended use environment, both the context of use and the environment were relevant to the task.

Simulated use for every participant, regardless of their user group or task, followed the same workflow, as shown below.


Based on discussions with the client, we knew the following about our users and their use environments:


  • Pharmacists and pharmacy technicians are always selecting one specific strength of Brand B as a carton from a refrigerator shelf that also has:

    • other strengths of Brand B and

    • unrelated product brands.

  • Patients and caregivers are using one specific strength of Brand B as both a carton and a pen in a household that may also have:

    • other strengths of Brand B,

    • any strength of Brand A (given that the medication's contraindications are related), and

    • products from the same manufacturer that are not Brand B.



Differences between pharmacy and home use environments.

While the workflow for each task was the same for all participants, the scenarios looked different between pharmacists + pharmacy technicians and patients + caregivers. At the pharmacy, the only study variable was the medication's target strength within the same brand; at home, we had to verify that patients and caregivers could safely select the target strength within the same brand as well as differentiate between Brand B and Brand A or competitor brands.

Therefore, the session structure looked like this:


Counterbalance

The counterbalance for this study was complex and multivariate. The following variables were included in the counterbalance:

  • Labeling design

    • Pharmacists + pharmacy technicians: participants assigned either Design 1 or Design 2

    • Patients + caregivers: participants assigned either Design 1 or Design 2

  • Target medication strength

    • Pharmacists + pharmacy technicians: within-subject counterbalance of target medication strengths for each task

    • Patients + caregivers: within-subject counterbalance of target medication strengths for each task

  • Carton task scenario order

    • Pharmacists + pharmacy technicians: none – scenario is the same across all tasks

    • Patients + caregivers: within-subject counterbalance of carton scenarios between tasks

After my conversations with senior team leaders, I determined the best way to manage the counterbalance was to make a primary document with all possible counterbalances and message each project team before their session to assign them a counterbalance and a participant ID number.


Attempt #1: The Centralized Counterbalancing Chart

Though we needed just 120 participants, there were hundreds of potential combinations for pharmacists/pharm techs:


Then, there were even more for patients and caregivers:


Since running all of these combinations was impossible, we decided with our client that our priority would be satisfying FDA requirements for validation testing, which meant running a minimum of n=15 participants per user group for each design to reach N=120. As long as we hit n=15 for both designs across all user groups and scenarios, we would be in the clear.
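To give a sense of why the combination count exploded, the counterbalance can be sketched as a Cartesian product of its variables. The level counts below are hypothetical stand-ins (the real strengths, task counts, and scenarios are covered by NDA), but the structure mirrors the study: every row pairs a design with an ordered pick of target strengths, and the patient/caregiver rows add a scenario order on top.

```python
from itertools import permutations, product

# Hypothetical levels -- the real strengths and scenarios are under NDA.
designs = ["Design 1", "Design 2"]
strengths = ["A", "B", "C", "D", "E"]   # e.g., five medication strengths
scenarios = ["S1", "S2", "S3"]          # e.g., three carton scenarios

# Assume each participant completes three tasks, each with a different
# target strength, so a row needs an ordered pick of 3 of the 5 strengths.
strength_orders = list(permutations(strengths, 3))

# Pharmacists + pharmacy technicians: design x strength order.
pharmacy_rows = list(product(designs, strength_orders))

# Patients + caregivers also counterbalance the carton scenario order.
scenario_orders = list(permutations(scenarios))
patient_rows = list(product(designs, strength_orders, scenario_orders))

print(len(pharmacy_rows))  # 2 designs * 60 strength orders = 120
print(len(patient_rows))   # 120 * 6 scenario orders = 720
```

Even with these made-up numbers, the row count dwarfs the participant count, which is why we only needed to sample enough rows to hit n=15 per cell rather than cover every combination.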

For this approach to work, I needed to create a Centralized Counterbalancing Chart, the document with pre-determined counterbalances ready for me to assign to each room before their session. My goal with this top-down approach was to be the point person for our counts in case we had any participant no-shows and avoid confusion between the two rooms.

Attempt #1 at managing the counterbalance, which I'm calling the Centralized Counterbalancing Chart. To make this work, I had to select a row and assign it to each room prior to each session.

On Day One of fieldwork, we realized quickly that waiting for me to send a counterbalance was a major weak link in our workflow. My attention was already divided across simultaneous sessions, and I became a bottleneck for my teams, leading to panic and frustration.


Attempt #2: The Collaborative Counterbalance Chart

At the end of a grueling 11-hour day, my team voiced their vexation, and we brought on a senior researcher to help us address the problem. She advised us to make the document accessible to everyone so that each team could claim a counterbalance on their own and clearly cross it out when it was used – after all, there were hundreds, if not thousands, to choose from.


Attempt #2 at managing the counterbalance: the Team Counterbalancing Chart. This decentralized system worked far better for managing the complexity of fieldwork by empowering my team members to act independently.

While I wrote our client debrief email, one of my colleagues put her head down and turned the centralized document into a collaborative one. Thanks to her hard work and foresight, our counterbalance was smooth sailing for the rest of the study.
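The decentralized chart boils down to a claim-and-mark operation: any team can take the first unclaimed row and cross it out so no one else reuses it. In practice this was a shared spreadsheet, not code, but the idea can be sketched like this (the row structure and team names are hypothetical):

```python
# Sketch of the claim-a-row idea behind the collaborative chart.
# Hypothetical rows; the real chart held the full counterbalance.
counterbalance_rows = [
    {"id": 1, "design": "Design 1", "strengths": ("A", "B"), "claimed_by": None},
    {"id": 2, "design": "Design 2", "strengths": ("B", "A"), "claimed_by": None},
    {"id": 3, "design": "Design 1", "strengths": ("B", "A"), "claimed_by": None},
]

def claim_row(rows, team):
    """Claim the first unclaimed row, marking it so no other team reuses it."""
    for row in rows:
        if row["claimed_by"] is None:
            row["claimed_by"] = team  # analogous to crossing the row out
            return row
    raise RuntimeError("No counterbalance rows left to assign")

room_a = claim_row(counterbalance_rows, "Room A")
room_b = claim_row(counterbalance_rows, "Room B")
print(room_a["id"], room_b["id"])  # prints "1 2"
```

The key design property is that no central coordinator sits in the loop: each room self-serves from the shared pool, and the "crossed out" mark is what prevents double assignment.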


Challenges

  • Fluctuating project scope. As the package design itself could potentially influence human factors tasks, it was decided during bidding to test both Package Design 1 and Package Design 2 with participants. Given that there were four user groups and that 15 participants per user group were needed to validate each design, the study ballooned from N=60 participants to N=120 to evaluate both designs (n=60 for Design 1 and n=60 for Design 2). Our client was also having internal discussions about how many participants to include, which resulted in the N value fluctuating day-to-day during data collection. This uncertainty significantly impacted our project scope.

  • Massive counterbalances. As discussed in the Research Methods section, the counterbalance alone was a beast to deal with. As the scope of the project increased, so did the size of the counterbalance.

  • Zero prep time on the first day of fieldwork. Despite the massive amount of physical materials (cartons, pen injectors, A/V equipment, paperwork, etc.) that we would have to travel with, the fieldwork schedule in our company's project proposal did not include any time for the project team to set up before the start of sessions. Sessions were scheduled to start at 9am on a Monday, and the earliest we could get access to the building was 7am that same morning. This meant that we had less than two hours to unpack and organize our three suitcases of stimuli, set up and test the audiovisual equipment in each of our rooms, build the wire shelves that we had bought in town from Target the day before to hold the medication cartons, prepare our study materials, coordinate counterbalances, and introduce ourselves to the facility staff – all before our first participants showed up and before I had ever supervised a day of fieldwork. And then, after doing it in Raleigh, we'd have to do it again in Tampa.


Outcomes

Participants were overwhelmingly able to select the correct target medication. Across hundreds of tasks, there were just two close calls and one use error, and root-cause probing showed that all three stemmed from cognitive slips rather than the labeling design.

Over time, it became clear from the data that there was no difference in task success between Design 1 and Design 2. Since participants were able to safely select their target medication regardless of the design, it looked like the manufacturer's marketing team would be able to pick whichever they liked best. Two thirds of the way into the study, our client told us that the selection had already been made, but they still wanted the data anyway. Unsurprisingly, the design chosen was the one that used less ink.

The client was able to use the data collected to demonstrate to the FDA that their product is safe and effective for use. The drug product and its associated packaging have since been approved by the FDA and are now on the market.


Lessons Learned

This project still keeps me up at night, but I learned how to fail forward.

I walked away from this project with two major lessons learned:

  1. I didn't know what I didn't know, and I didn't do a good enough job of figuring it out.

    • What went wrong: I tried too hard to maintain control when I should have accepted that I was overwhelmed and communicated that clearly to my senior oversight.

    • What I learned: Speak up when you are overwhelmed and uncertain, and give your team the space to do the same.

  2. The leaders who make things appear effortless on the surface have done a lot of work to get there.

    • What went wrong: I assumed that the level of uncertainty meant that I would figure things out as I went, which resulted in chaos and frustration.

    • What I learned: Preparation takes a lot of work. Anticipate what could go wrong, and be prepared for any possibility. The Dunning-Kruger Effect is real.


That said, there were a number of institutional failures that did not set me up for success. Speaking up about my overwhelm is easier said than done, and quite frankly, I suspect that it would have reflected poorly on me if I had. As I would learn after this project, the culture in which I worked increasingly seemed to be more about discipline than about productive communication, and anything less than perfection would not be tolerated.

Still, thinking about how I was failed is less helpful to me than thinking about how I myself failed and asking how I can do better in the future. If I can control only my own actions and attitudes, that is more than enough.

Last updated: June 23, 2025

NOTE: This project is covered by a non-disclosure agreement that prevents me from giving details about our client, the product, or study artifacts. Given my status as a third-party consultant, I also do not have information about the long-term impacts of my contributions beyond the execution and delivery of our statement of work that isn't also available to the general public.

Contact

  • LinkedIn



© 2025 by Nohra Murad.

The viewpoints expressed on my website are solely my own and do not necessarily reflect those of my current or former employers.
