Program Testing: How Programmers Find Errors and Prove a Program Works
Unit 1 · Concept 4 of 5 · ~7 min read · 10–13% of AP exam
By APScore5 Editorial Team · AP CSP Subject Specialists
Run your program with normal, edge, and invalid inputs — then use what you find to make the program better. Built for AP CSP students preparing for the Create Task and unit MCQs.
Program testing is the process of running a program with different inputs to check whether it works the way you planned. Testing finds bugs before users do — and on the AP CSP Create Task, your testing description is one of the 6 rows on the rubric. Good testing uses normal cases, edge cases, and invalid inputs.
Updated May 2026 · Reviewed by APScore5 Editorial Team · AP Computer Science Principles Big Idea 1
4 test case types · Row 6 of Create Task rubric · Testing ≠ debugging · Tested in MCQs
Program testing is running your program with different inputs to check whether it works correctly. The goal is to find bugs early — while they're easy to fix — instead of letting users find them later. Strong testing covers normal cases, edge cases, and inputs the program wasn't designed to handle.
Almost every nontrivial program has bugs at some point. Experienced programmers expect bugs; they find them quickly and fix them. Testing is the tool that makes that possible. When you test, you're not only trying to confirm that your program works. You're trying to break it, because every failure you find is one a real user won't run into later.
Figure - AP CSP Program Testing Case Types
Testing matters on the AP CSP exam in two specific ways. First, MCQs often describe a testing scenario and ask which input would best reveal a bug, or which test case would prove the program works. Second, Row 6 of the Create Task rubric asks you to describe two specific procedure calls with different arguments and outcomes — that's testing.
Testing is part of the larger program development process and closely tied to iterative development. You don't test once at the end. You test after every small change, find what's broken, fix it, then test again.
It also sits alongside collaboration in computing: partners often design tests together, compare expected outputs, and re-run cases after a fix. A shared test log keeps everyone aligned on what "working" means before you write program documentation for graders or users.
Why program testing matters on the AP CSP exam
Unit 1 MCQs frequently describe a short program and ask which input would best expose a bug, or whether a described action is testing or debugging. The Create Task dedicates an entire rubric row to how you tested two procedure calls. Treat testing as evidence, not a checkbox: you are showing that you ran the program, compared outputs, and learned something specific from each run.
Test types
Types of Test Cases
There are 4 main types of test cases: normal cases (expected inputs), edge cases (boundary values), invalid cases (unexpected inputs), and repeated cases (the same action done many times). Strong testing uses all four.
Figure - Program Testing Workflow AP CSP
| Type | What it tests | Example for a quiz app |
| --- | --- | --- |
| Normal case | Inputs you expect users to give | User types the correct answer |
| Edge case | Inputs at the boundary of allowed values | User enters 0 questions or the maximum number |
| Invalid case | Inputs the program wasn't designed to handle | User leaves the answer blank or types a number where text was expected |
| Repeated case | The same action many times in a row | User clicks "Next" 100 times rapidly |
Why each type matters
Normal cases prove the program works for typical use. If a quiz app doesn't handle a correct answer, nothing else matters.
Edge cases find bugs hidden at boundaries: the first item in a list, the last item, the largest allowed value, the empty case. Many bugs, such as off-by-one errors, show up here.
Invalid cases test how the program handles surprise inputs. Real users will hit the wrong button, type the wrong thing, or leave fields blank. A strong program either prevents these or handles them gracefully.
Repeated cases find bugs that show up over time: memory issues, a score that doesn't reset, lists that grow without limit.
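Here is a minimal Python sketch of all four test case types run against a quiz helper. The procedures `check_answer` and `count_correct` are hypothetical, invented for this example; your own program will have different procedures, and treating invalid input as a wrong answer is just one possible design.

```python
# Hypothetical quiz helpers, invented for illustration only.
def check_answer(user_answer, correct_answer):
    """Return True when the answer matches, ignoring case and outer spaces."""
    if not isinstance(user_answer, str) or user_answer.strip() == "":
        return False  # invalid input counts as a wrong answer instead of crashing
    return user_answer.strip().lower() == correct_answer.lower()


def count_correct(answers, correct_answer):
    """Count how many answers in a list are correct."""
    total = 0
    for answer in answers:
        if check_answer(answer, correct_answer):
            total += 1
    return total


# Normal case: an input you expect users to give
normal_result = check_answer("iteration", "iteration")

# Edge case: a boundary value (an empty list of answers)
edge_result = count_correct([], "iteration")

# Invalid cases: a blank answer, and a number where text was expected
blank_result = check_answer("", "iteration")
number_result = check_answer(42, "iteration")

# Repeated case: the same action 100 times in a row
repeated_result = count_correct(["iteration"] * 100, "iteration")

print(normal_result, edge_result, blank_result, number_result, repeated_result)
# prints: True 0 False False 100
```

Each variable holds the actual result of one test, so you can compare it to what you expected before moving on.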
Quick reference: four test case types
Normal case — expected inputs (correct answer on a quiz question)
Edge case — boundary values (zero questions, maximum score, first or last item in a list)
Invalid case — unexpected inputs (blank answer, wrong data type)
Repeated case — same action many times (rapid clicks on "Next")
On MCQs, read the scenario for which input was used — that usually tells you the type without memorizing definitions in isolation.
Compare terms
Testing vs Debugging
Testing finds problems. Debugging fixes them. Testing answers "does this work?" Debugging answers "why doesn't this work, and how do I make it work?" You can't debug what you haven't tested.
Figure - Input Test Fix Program Cycle
| Term | What it means | When you do it |
| --- | --- | --- |
| Testing | Running the program to check whether it works | After every small change |
| Debugging | Finding and fixing the cause of a bug | After testing reveals a problem |
Example flow:
1. You write a procedure that updates a score
2. You test it by entering a correct answer and the score doesn't increase
3. You debug by reading the code, find that you forgot to call the procedure, and add the missing line
4. You test again and the score increases correctly
5. You test an edge case: what if the user enters the answer with extra spaces?
6. The test fails. Back to debugging.
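The last part of that flow (the extra-spaces test fails, so you debug and test again) can be sketched in Python. Everything here is an assumption made for illustration: the procedure names, the one-point scoring rule, and the choice to fix the bug by stripping spaces.

```python
# Hypothetical procedures; names and scoring rule are assumptions.
def update_score_buggy(score, user_answer, correct_answer):
    """First attempt: compares the raw text exactly."""
    if user_answer == correct_answer:
        return score + 1
    return score


def update_score_fixed(score, user_answer, correct_answer):
    """After debugging: strips outer spaces before comparing."""
    if user_answer.strip() == correct_answer:
        return score + 1
    return score


def run_tests(update_score):
    """Testing step: run chosen inputs and list the names of failed tests."""
    tests = [
        # (test name, user input, expected score starting from 0)
        ("normal", "iteration", 1),
        ("extra spaces", "  iteration  ", 1),
        ("wrong answer", "loop", 0),
    ]
    failed = []
    for name, user_answer, expected in tests:
        actual = update_score(0, user_answer, "iteration")
        if actual != expected:
            failed.append(name)
    return failed


print(run_tests(update_score_buggy))  # prints: ['extra spaces']
print(run_tests(update_score_fixed))  # prints: []
```

`run_tests` is the testing step: it only reports what failed. Changing `update_score_buggy` into `update_score_fixed` is the debugging step. Running the same tests again confirms the fix.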
Exam tip: testing vs debugging on MCQs
If the stem asks what a student is doing when they run the program with chosen inputs, the answer is testing. If the stem asks what they do after a test fails — read code, add a line, fix a variable — the answer is debugging. Many wrong answers swap the two terms; read whether the action finds the bug or fixes it.
How this shows up in the Create Task
Row 6 of the Create Task rubric is dedicated to testing. You'll describe two specific procedure calls — with different arguments — and explain what happened. Most students lose this point by writing vague tests ("I tested it and it worked"). Strong testing means specific arguments, specific outputs, and a clear difference between the two calls.
A good test case has 5 parts: the input you're using, the expected result, the actual result, whether it passed or failed, and what you'll change if it failed. Writing this down — even in a simple table — is the difference between weak testing and strong testing.
Figure - Find Bugs Before Users Guide
| Test name | Input | Expected output | Actual output | Pass/Fail | Action |
| --- | --- | --- | --- | --- | --- |
| Correct answer | "iteration" | Score increases by 1 | Score increases by 1 | Pass | None |
| Wrong answer | "wrong word" | Score stays same | Score increases by 1 | Fail | Fix the comparison logic |
| Blank answer | (empty) | "Please enter an answer" | Program crashes | Fail | Add input validation |
| Same answer twice | "iteration" then "iteration" | Score increases by 1, then 1 more | Score increases by 2 total | Pass | None |
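If you'd rather keep your test log in code than on paper, here is one possible sketch. It stores the five parts of a test case in a small record and lets the program fill in the actual result and the pass/fail column. The `CaseRecord` class and the `score_change` procedure are hypothetical, and the results below come from this sketch, not from the table above.

```python
from dataclasses import dataclass


@dataclass
class CaseRecord:
    """One row of a test log: the five parts of a test case."""
    name: str
    test_input: str
    expected: int
    actual: int
    action_if_failed: str

    @property
    def passed(self):
        return self.expected == self.actual


# Hypothetical procedure under test: points earned for one answer.
def score_change(user_answer, correct_answer="iteration"):
    return 1 if user_answer == correct_answer else 0


def run_case(name, test_input, expected, action_if_failed):
    """Run one test and record what actually happened."""
    actual = score_change(test_input)
    return CaseRecord(name, test_input, expected, actual, action_if_failed)


log = [
    run_case("Correct answer", "iteration", 1, "Fix the comparison logic"),
    run_case("Wrong answer", "wrong word", 0, "Fix the comparison logic"),
    run_case("Blank answer", "", 0, "Add input validation"),
    run_case("Extra spaces", " iteration ", 1, "Strip spaces before comparing"),
]

for record in log:
    status = "Pass" if record.passed else "Fail"
    action = "None" if record.passed else record.action_if_failed
    print(f"{record.name}: expected {record.expected}, "
          f"actual {record.actual}, {status}, action: {action}")
```

In this sketch the first three tests pass and the extra-spaces test fails, which tells you exactly what to debug next.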
Tips for the Create Task specifically
Pick two test cases where the arguments are different AND the outputs are different
Use specific values, not vague descriptions
Show the procedure being called both times — not the whole program
Match each test to a result you can point to
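As a rough illustration of those tips, here are two calls to the same procedure with different arguments that produce different results. The `check_answer` procedure is a hypothetical stand-in written in Python; in your Create Task you would describe calls to your own procedure in whatever language you used.

```python
# Hypothetical procedure; a real Create Task procedure would be your own.
def check_answer(user_answer, correct_answer):
    """Return True if the two answers match, False otherwise."""
    if user_answer == correct_answer:
        return True
    else:
        return False


# Call 1: the arguments match, so the first branch runs
first_call = check_answer("iteration", "iteration")

# Call 2: the arguments differ, so the else branch runs
second_call = check_answer("selection", "iteration")

print(first_call)   # prints: True
print(second_call)  # prints: False
```

The two calls use specific values, and each one sends the procedure down a different branch, so the difference in output is easy to point to.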
Official Big Idea 1 standards on College Board AP Central describe how computing innovations are developed through collaboration and iteration — testing is how you prove each iteration actually works before you document or ship it.
Input validation and failed invalid tests
When an invalid test fails, the fix is often input validation — code that checks input before the main logic runs (for example, rejecting a blank string or showing a message instead of crashing). Validation is not the same as testing: you test to discover the need; you add validation during debugging. Re-test the same invalid input after you add validation to confirm the program now handles it gracefully.
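Here is a minimal sketch of input validation, assuming a quiz program with a hypothetical `submit_answer` procedure. The procedure names and the "Correct" and "Incorrect" messages are invented for illustration; only the "Please enter an answer" message comes from the test log earlier in this guide.

```python
# Hypothetical procedures, invented for illustration only.
def validate_answer(raw_answer):
    """Check input before the main logic runs. Returns (is_valid, message)."""
    if raw_answer is None or raw_answer.strip() == "":
        return False, "Please enter an answer"
    return True, ""


def submit_answer(raw_answer, correct_answer, score):
    """Return the new score and a message for the user."""
    is_valid, message = validate_answer(raw_answer)
    if not is_valid:
        return score, message  # handle the surprise input without crashing
    if raw_answer.strip() == correct_answer:
        return score + 1, "Correct"
    return score, "Incorrect"


# Re-test the invalid inputs that failed before validation was added
blank_result = submit_answer("", "iteration", 0)
spaces_result = submit_answer("   ", "iteration", 0)

# Re-test a normal case to confirm the fix did not break anything
normal_result = submit_answer("iteration", "iteration", 0)

print(blank_result)   # prints: (0, 'Please enter an answer')
print(spaces_result)  # prints: (0, 'Please enter an answer')
print(normal_result)  # prints: (1, 'Correct')
```

The validation check runs first, so the comparison logic never sees a blank answer.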
Avoid traps
Common Testing Mistakes
These patterns show up in student code, Create Task written responses, and released MCQs. Fixing them early saves points on Row 6 and on unit exams.
| Mistake | Why it costs points | Better approach |
| --- | --- | --- |
| Testing only correct inputs | You miss bugs at boundaries and edges | Include at least one edge case and one invalid case |
| Testing only at the end | Bugs compound; debugging gets harder | Test after every feature you add |
| Vague test descriptions | Row 6 of the rubric wants specific arguments and outputs | Write: "I called checkAnswer('iteration', 'iteration') and got TRUE" |
| Skipping the "expected result" | You can't tell if a test passed or failed | Always write what you expected before you run the test |
| Two tests with same outcome | Row 6 wants different outcomes from different arguments | Make sure your two test calls produce visibly different results |
| Treating debugging as testing | They're different steps | Test → find bug → debug → test again |
After you fix a failed test, always re-run the same test plus at least one edge or invalid case — fixes often break something else at the boundary.
Timed reasoning
AP-Style Practice
These three questions mirror common Unit 1 MCQ patterns: incomplete test coverage, classifying test case types, and Create Task Row 6 requirements.
When you review your own Create Task draft, ask the same questions: Did I test only normal inputs? Did I describe two calls with different arguments and different outcomes?
1. A student tests their program only with the inputs they expect users to give. What is the main weakness of this testing strategy?
Easy
Answer: B. Testing only normal inputs leaves bugs hiding in edge and invalid cases. Real users hit edges and unexpected inputs all the time.
Why D is wrong: Documentation isn't directly affected by which inputs you test. The problem is incomplete coverage.
2. A program crashes when a user enters a blank answer. What type of test case revealed this bug?
Medium
Answer: C. A blank answer is input the program wasn't designed to handle, which makes it an invalid case. The fix would be input validation — check for blank before processing.
3. A student runs their procedure twice with the same argument and gets the same output both times. They use this as one of their two test cases on the Create Task. Why is this a weak choice?
Hard
Answer: B. Row 6 specifically asks for two procedure calls where the arguments differ AND the outputs differ. Same arguments + same outputs proves the procedure runs but doesn't prove the parameter does anything.
Confidence gate
What You Can Now Do
Tick each line when you can explain the idea without looking at the tables above. When all five are checked, you are ready for the next Unit 1 concept — program documentation — where you explain your code to graders and future users.
Quick answers
Frequently Asked Questions
What is program testing?
Program testing is the process of running your program with different inputs to check whether it works the way you planned. The point is to find bugs before users do — and the earlier you find them, the easier they are to fix. Strong testing uses several types of inputs: normal, edge, invalid, and repeated.
What are the types of test cases?
The four main types are normal cases (expected inputs), edge cases (boundary values like 0 or maximum), invalid cases (unexpected inputs like blanks or wrong types), and repeated cases (the same action done many times). A good testing strategy uses all four, not just normal cases. Most bugs hide in edge and invalid cases.
What's the difference between testing and debugging?
Testing finds problems. Debugging fixes them. You test by running the program with chosen inputs and comparing actual output to expected output. You debug by reading the code, finding the cause of a failure, and changing the code to fix it. You can't debug what you haven't tested.
What is an edge case?
An edge case is a test input at the boundary of allowed values — the smallest input, the largest input, the first item in a list, the last item, or an empty list. Edge cases are where most bugs hide because programmers usually code for the middle, not the edges. Always include at least one edge case in your testing.
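To make the boundaries concrete, here is a short sketch that tests the first item, the last item, one step past the last item, and an empty list. The `get_question` procedure is hypothetical, and returning `None` for an out-of-range index is just one possible design.

```python
# Hypothetical procedure, invented for illustration only.
def get_question(questions, index):
    """Return the question at index, or None if the index is out of range."""
    if index < 0 or index >= len(questions):
        return None
    return questions[index]


questions = ["Q1", "Q2", "Q3"]

first = get_question(questions, 0)                      # first item
last = get_question(questions, len(questions) - 1)      # last item
past_end = get_question(questions, len(questions))      # one step past the end
from_empty = get_question([], 0)                        # empty list

print(first, last, past_end, from_empty)
# prints: Q1 Q3 None None
```

A common off-by-one mistake is writing `index > len(questions)` instead of `index >= len(questions)`. The `past_end` test is the one that would catch it.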
Why should I test with invalid inputs?
Real users will type the wrong thing, leave fields blank, or click the wrong button. If your program crashes on invalid input, the user experience is broken and your Create Task video might fail to demonstrate the program. Testing with invalid inputs lets you add input validation that handles surprises gracefully.
How many test cases should I have on the Create Task?
For Row 6 of the rubric, you only need to describe two procedure calls, but make sure the arguments differ AND the outputs differ. While building, you should test much more than that: run a range of normal, edge, and invalid cases during development, then pick the two strongest calls to write about.
How does program testing connect to the AP CSP Create Task?
Row 6 of the Create Task rubric is dedicated to testing. You'll describe two specific procedure calls with different arguments and explain what happened in each case. Students who tested throughout their development have stronger material to write about than students who tested once at the end.
What makes a good test case?
A good test case has 5 parts: the input being used, the expected output, the actual output, whether the test passed or failed, and what you'll change if it failed. Writing this down — even in a simple table — turns testing into proof. Vague tests like "I tested it and it worked" lose points on the rubric.