Driving Product Confidence in 1 Day with AI-Assisted Usability Testing

Conducted unmoderated testing to evaluate tap accuracy in dense charts and validate design decisions against stakeholder feedback

Role
UX Designer
Timeline
1 day
Team
UX, Product, Architects
Tools used
Figma, Dscout, ChatGPT

Summary

During a design review of the fuel savings chart, stakeholders raised concerns about tap accuracy on tightly spaced bars and proposed solutions like adding extra steps or scrolling.


These changes would have increased interaction cost and complexity.

To validate the concern with real users, I ran an AI-assisted unmoderated usability test, creating screeners and tasks with ChatGPT and conducting the study on Dscout, all within a single day.

  • No usability issues were reported around tap accuracy

  • 7/7 users successfully interacted with the chart without any mis-taps

  • Users completed tasks confidently and without friction

Outcome

  • Avoided unnecessary feature changes (extra steps, scrolling behavior)

  • Reduced decision-making time from days or weeks to a single day

  • Enabled a data-backed design decision, aligning stakeholders quickly


Challenge

This was my first time evaluating tap accuracy in a dense interface.

Unlike typical usability tests, I needed to design a study that captured interaction precision (mis-taps, accuracy, confidence) without biasing users.

Key challenges included:

  • What should the right screener look like to ensure relevant participants (mobile users, touch interaction)?

  • How should tasks be framed so users behave naturally, without bias toward tapping?

  • Would participants clearly understand the task in an unmoderated setup without guidance?


All of this had to be done quickly while maintaining a reliable and high-quality study.
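The stakeholders' worry can be made concrete with a quick back-of-envelope check of per-bar tap area against common touch-target guidance. All numbers below are illustrative assumptions, not measurements from the actual chart:

```python
# Hypothetical check: with 12 tightly packed monthly bars, how wide is
# each bar's tap target? All values are illustrative assumptions.
CHART_WIDTH_PX = 360   # assumed usable chart width on a typical phone screen
NUM_BARS = 12          # one bar per month
MIN_TARGET_PX = 44     # widely cited minimum touch-target size (~44 px)

hit_width_px = CHART_WIDTH_PX / NUM_BARS
verdict = "meets" if hit_width_px >= MIN_TARGET_PX else "falls below"
print(f"Per-bar hit area: {hit_width_px:.0f} px ({verdict} the {MIN_TARGET_PX} px guidance)")
```

Under these assumed dimensions each bar gets roughly 30 px, below the guidance, which is why the concern was plausible on paper and worth testing with real users rather than dismissing.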

My Approach

To validate the design quickly, I needed to set up an unmoderated usability test end-to-end within a day, including recruitment and task design.

1. Designing the Screener (with AI + Judgment)

I started from my previous screener questions as a base, then gave ChatGPT the context and testing goal to generate a new screener.
However, the initial output was misaligned with my objective: it focused on discoverability instead of tap accuracy and interaction precision.


Instead of using it directly, I:

  • Identified the gap in interpretation

  • Refined my prompt to clearly emphasize tap accuracy in dense UI

  • Iterated with ChatGPT until the output aligned with my goal

This helped me quickly arrive at relevant screener and knockout questions tailored to the study.


2. Recruiting Participants

I launched the screener on Dscout and quickly recruited the right participants, ensuring they were:

  • Mobile-first users

  • Comfortable with touch interactions
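The two recruitment criteria above can be expressed as simple knockout logic. The field names and the 1–5 comfort scale below are hypothetical illustrations, not Dscout's actual screener format:

```python
# Hypothetical knockout logic mirroring the two screener criteria above.
# Field names and the 1-5 comfort scale are illustrative assumptions.
def qualifies(answers: dict) -> bool:
    is_mobile_first = answers.get("primary_device") == "smartphone"
    is_touch_comfortable = answers.get("touch_comfort", 0) >= 4  # self-rated, 1-5
    return is_mobile_first and is_touch_comfortable

print(qualifies({"primary_device": "smartphone", "touch_comfort": 5}))  # True
print(qualifies({"primary_device": "desktop", "touch_comfort": 5}))     # False
```

Framing each criterion as a knockout (any "no" disqualifies) keeps the screener short while filtering for the realistic touch behavior the study depends on.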

3. Crafting the Usability Mission

Next, I used ChatGPT to generate a usability test mission.
The initial response included multiple scenarios and task variations.
I curated the most relevant ones and refined them further through multiple iterations.


Key improvements I focused on:

  • Making the task natural and goal-driven (not instructing users to “tap”)

  • Ensuring clarity for unmoderated execution

  • Reducing ambiguity so users could complete tasks independently

I continuously iterated with ChatGPT, refining prompts and outputs until the mission was clear, unbiased, and aligned with the testing objective.

My prompt


Response I used


Response I rejected


Execution

I launched the study on Dscout as an unmoderated usability test, enabling quick turnaround and real user interaction without scheduling constraints.

  • 7 participants completed the study successfully

  • All participants used mobile devices, ensuring realistic touch interaction

  • The study captured screen recordings, task responses, and verbal feedback (think-aloud)

This setup allowed me to observe:

  • How users interacted with tightly packed bars

  • Whether they experienced mis-taps or hesitation


Outcome

The usability test confirmed that tap accuracy was not an issue, even with tightly spaced bars.

  • 7/7 participants successfully tapped the correct months without any mis-taps

  • All participants were able to complete the interaction quickly and without hesitation

  • Screen recordings clearly showed high tap precision, even for closely placed bars
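A 7/7 result from a small sample still leaves statistical uncertainty worth acknowledging. As a rough sanity check (my own calculation, not part of the original study), a Wilson score interval for the observed success rate can be computed:

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)

low, high = wilson_interval(7, 7)
print(f"95% CI for task success: {low:.0%} to {high:.0%}")  # roughly 65% to 100%
```

In other words, 7/7 successes is consistent with a true success rate of at least about two-thirds, which is strong enough evidence to avoid a speculative redesign while remaining honest about the sample size.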


However, some interesting behavioral patterns emerged:

  • Most participants evaluated the experience holistically rather than focusing only on the tapping interaction, commenting on:

    • Year selection

    • Amount of data shown

    • Overall usability

  • One participant did not follow the task sequence (September → December → November), which led to confusion and frustration

    • Despite this, their recording showed accurate and effortless tapping, reinforcing that tap precision was not the issue

Overall, the results provided strong evidence that:

  • The current design supports accurate and reliable interaction

  • The perceived risk of mis-taps raised by stakeholders was not observed in real user behavior

Recordings of participants

Learnings

Task clarity is critical in unmoderated testing

Even a small misunderstanding (like task order) can:

  • Break the flow

  • Create unnecessary frustration

Clear and simple instructions are essential, especially when no moderator is present.

Behavior > Self-reported feedback

Although one participant reported difficulty due to task confusion, their actual interaction behavior (recordings) showed no issues with tapping. This reinforced the importance of:

  • Observing real behavior

  • Not relying only on verbal or written feedback
