25 shell scripting interview questions for DBAs, with scenario-based answers on backups, log parsing, exit codes, cron jobs, and safe production automation.
Shell scripting interview questions for DBAs aren't really about Bash. They're about whether you understand the operational job well enough to automate it safely — and that's a meaningfully different test than being able to recite what `$@` expands to.
Most candidates preparing for a DBA technical round study Bash syntax. The interviewers are actually watching for something else: can you write a backup wrapper that stops cleanly when the dump fails? Can you schedule a maintenance job without causing overlap? Can you parse a noisy log file to find the signal that matters, and then alert on it without crying wolf at 3 a.m.? Those are the real shell scripting interview questions for DBAs, and they don't show up in generic Bash lists.
This guide covers the concepts, the scenarios, and the judgment calls that come up in DBA screening and technical rounds — with annotated examples, honest scoring advice for hiring managers, and a clear picture of what a strong answer actually sounds like.
What DBA Interviewers Are Really Probing with Shell Scripting
What Do You Actually Need Bash for in a DBA Job?
Shell scripting in a DBA role is glue work. It connects database tools — `pg_dump`, `mysqldump`, `expdp`, `sqlplus` — to the rest of the operating environment: cron schedules, log files, monitoring systems, and on-call alerts. The script itself is rarely the hard part. The hard part is making sure it handles the failure path as carefully as the success path.
Interviewers know this. When they ask about shell scripting, they're not testing whether you can write a for loop. They're testing whether you understand what "automating a backup" actually means in production: that the job ran, that the file landed in the right place, that the size is plausible, and that someone finds out immediately if any of that went wrong. The GNU Bash manual documents the mechanics; the interview is testing the judgment about when and how to apply them.
Why Generic Bash Answers Fall Flat Fast
Textbook Bash knowledge is genuinely useful. Knowing how to write a conditional, how to redirect output, how to use a here-doc — none of that is wasted. The problem is that a textbook answer stops at "write a loop that iterates over a list of databases." A DBA interview answer has to continue: what exit code does `pg_dump` return on failure? Does your script detect that, or does it continue to the next step and log a false success? What happens to the cron email if you redirect stderr to /dev/null?
The textbook version works fine in a classroom. It breaks in production the first time the database host is unreachable and the script reports success anyway.
Which Mistakes Make a Candidate Sound Book-Smart but Unusable?
The clearest tell is the service-check scenario. Ask a candidate: "How would you check whether a database is up every five minutes?" A book-smart answer checks whether the process is running with `ps aux | grep postgres`. A production answer checks whether the service actually responds, because a hung process can still show up in `ps` while the database is completely unresponsive. Then it handles the case where the check itself fails, avoids alerting on transient blips, and logs enough context for the on-call engineer to act without SSHing in first.
As one senior DBA who has run hiring panels at a large financial services firm put it: "I can teach someone `awk`. I can't teach them to think about what happens when the script succeeds but the database is still broken. That's the thing I'm listening for."
The Bash Basics DBAs Still Expect You to Know
Which Shell Are You Writing for, and Why Does the Shebang Matter?
The shebang line — `#!/bin/bash` versus `#!/bin/sh` — is not cosmetic. `/bin/sh` invokes the POSIX shell, which on some Linux distributions is `dash`, not Bash. If your script uses Bash-specific syntax like `[[ ]]`, process substitution, or arrays, it will fail silently or with a confusing error on systems where `/bin/sh` is not Bash. For DBA scripts that run on multiple database servers with different OS configurations, this distinction matters.
The safe default for DBA automation is `#!/bin/bash` with an explicit path, and if portability across systems is a real requirement, restrict yourself to POSIX-compliant syntax and test on the actual target shell. Interviewers who ask about shebangs are checking whether you know your scripts will run on the server they're deployed to, not just on your laptop.
How Do chmod +x, Execution, and Positional Parameters Show Up in Real Scripts?
Making a script executable is one command: `chmod +x backup.sh`. Running it with arguments is where the real mechanics show up. A backup script that takes a database name and a target directory as arguments is immediately more useful than one with hardcoded values.
`$1`, `$2`, `$#` (argument count), and `$@` (all arguments) are the building blocks. Quoting them, always, is the difference between a script that works and one that breaks when a path contains a space. The interviewer's follow-up will almost certainly be: "What happens if `$BACKUP_DIR` is empty?" The answer should be: the script exits before it does any damage.
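The argument handling described above can be sketched as follows. This is a minimal illustration, not a complete backup script: the function name `validate_args` and the usage text are assumptions made for the example.

```shell
#!/usr/bin/env bash
# validate_args <db_name> <backup_dir>  (hypothetical helper)
# Returns non-zero before any work is done if the arguments are unusable.
validate_args() {
    if [ "$#" -ne 2 ]; then
        echo "usage: backup.sh <db_name> <backup_dir>" >&2
        return 2
    fi
    local db_name="$1" backup_dir="$2"
    if [ -z "$db_name" ] || [ -z "$backup_dir" ]; then
        echo "error: db_name and backup_dir must be non-empty" >&2
        return 2
    fi
    # Quoted, so a directory name containing spaces is tested as one path.
    if [ ! -d "$backup_dir" ]; then
        echo "error: backup directory '$backup_dir' does not exist" >&2
        return 1
    fi
    return 0
}
```

A script would typically call it first, as in `validate_args "$@" || exit $?`, so an empty or missing directory stops the run before any dump or delete command executes.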
When Do Loops and Functions Make a Script Better Instead of Just Longer?
Loops and functions earn their place when you're doing the same operation across multiple databases. A script that backs up one database is straightforward. A script that backs up twelve databases in a loop, calls a shared function for the exit-status check, and logs each result separately is genuinely better — not because it's clever, but because fixing a bug in the check function fixes it in all twelve places at once.
The interviewer's probe here is usually about readability: "Would someone else be able to maintain this at 2 a.m.?" The honest answer is that a well-named function called `check_backup_success` is far more maintainable than the same logic copy-pasted twelve times, even if the function body is only five lines.
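As a rough sketch of that structure, the loop below calls one shared check for every database. `check_backup_success` is the name used in the text; `backup_all` and the convention that the dump command is invoked as `<command> <db> <output_file>` are assumptions for illustration.

```shell
#!/usr/bin/env bash
# check_backup_success <exit_code> <file>: the single place where "success" is defined.
check_backup_success() {
    local rc="$1" file="$2"
    if [ "$rc" -ne 0 ]; then
        echo "FAIL: dump exited with status $rc ($file)" >&2
        return 1
    fi
    if [ ! -s "$file" ]; then
        echo "FAIL: $file is missing or empty" >&2
        return 1
    fi
    echo "OK: $file"
}

# backup_all <dump_command> <backup_dir> <db>...
# Calls <dump_command> <db> <output_file> for each database and logs each result.
backup_all() {
    local dump_cmd="$1" backup_dir="$2" failures=0 db out rc
    shift 2
    for db in "$@"; do
        out="$backup_dir/$db.dump"
        rc=0
        "$dump_cmd" "$db" "$out" || rc=$?
        check_backup_success "$rc" "$out" || failures=$((failures + 1))
    done
    # Non-zero overall if any single database failed.
    [ "$failures" -eq 0 ]
}
```

One failed database does not stop the loop, but it does make the whole run exit non-zero, which is usually what a monitoring system needs to see.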
Answer Backup Automation Questions Like Someone Who Has Done the Job
How Would You Write a Backup Script That Proves Success, Not Just Starts a Job?
The structural answer interviewers want covers four things: run the command, capture the exit code, verify the output, and alert on failure. Starting a backup job is not the same as completing one, and a script that doesn't distinguish between them is a liability.
When the dump is piped through `gzip`, check `${PIPESTATUS[0]}` rather than `$?`: `$?` reports only the last command in the pipeline, so a successful `gzip` would otherwise mask a failed `pg_dump`. That detail is exactly what separates a candidate who has written production database automation scripts from one who has only read about them.
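A minimal sketch of the run, capture, verify sequence might look like this. The helper name `dump_compressed` is hypothetical, and alerting is left to the caller (for example `dump_compressed appdb "$file" || send_alert`, where `send_alert` stands in for whatever mail or webhook command the environment provides).

```shell
#!/usr/bin/env bash
# dump_compressed <db_name> <output_file>  (hypothetical helper)
# Runs `pg_dump | gzip` and fails if either stage of the pipeline failed.
dump_compressed() {
    local db="$1" out="$2"
    local -a status
    pg_dump "$db" | gzip > "$out"
    status=("${PIPESTATUS[@]}")   # copy at once; the next command overwrites PIPESTATUS
    if [ "${status[0]}" -ne 0 ] || [ "${status[1]}" -ne 0 ]; then
        echo "backup of $db FAILED: pg_dump=${status[0]} gzip=${status[1]}" >&2
        rm -f -- "$out"           # never leave a partial file that looks like a backup
        return 1
    fi
    # Verify the output: the archive must exist and pass gzip's integrity test.
    if [ ! -s "$out" ] || ! gzip -t "$out"; then
        echo "backup of $db FAILED: $out is empty or corrupt" >&2
        return 1
    fi
    echo "backup of $db OK: $out ($(wc -c < "$out") bytes)"
}
```

The integrity test only proves the archive is a readable gzip stream. It does not replace a restore test.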
What Does a Good Answer Say About Cron Jobs and Timing?
A cron line is not an answer. The answer is everything around it: where does the output go, how do you know if the job didn't run, what happens if the previous run is still going when the next one starts, and is the backup window actually safe for the database load at that time?
Strong candidates mention the backup window explicitly — "we schedule the full dump at 2 a.m. because replication lag is lowest and user load is minimal" — and they mention logging cron output to a file rather than relying on cron's email delivery, which is unreliable in many environments. The cron documentation on Linux man pages covers the syntax; the interview is testing whether you've thought past the syntax.
How Do You Explain Restore Safety Without Sounding Theoretical?
The trap that catches candidates is the assumption that a completed backup is a usable backup. Interviewers who have worked in production know that backup files get corrupted, that a dump pipeline can report success while the file on disk is truncated, and that the only backup that matters is one you've actually restored from. The strong answer mentions restore testing explicitly, even if it's just a weekly restore to a staging instance with a row count check, because that's the operational standard, not the theoretical ideal.
PostgreSQL's backup and restore documentation makes this explicit: a backup is only as good as a verified restore. Saying that in an interview signals that you've thought about the full cycle, not just the dump command.
Explain Exit Codes, Traps, and Failure Propagation Without Hand-Waving
What Should Happen the Moment a Database Command Fails?
The clean failure chain is: capture the exit code immediately after the command, stop execution before the next unsafe step, log with enough context for someone to act, and alert. The mistake most candidates make is checking `$?` several lines after the command that set it — by then, another command has overwritten it.
Capture the exit code on the very next line. Don't rely on `set -e` alone — its behavior with pipes and subshells has enough edge cases that it should be a supplement, not a substitute for explicit checks.
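The difference between a late check and an immediate one is easy to demonstrate. Both function names below are invented for the comparison.

```shell
#!/usr/bin/env bash
# Wrong: by the time $? is read, echo has already replaced the command's status.
check_late() {
    "$@"
    echo "command finished" >/dev/null
    return $?                 # this is echo's status (0), not the command's
}

# Right: save the status on the very next line, then use the saved value.
check_immediately() {
    "$@"
    local rc=$?
    echo "command finished" >/dev/null
    return "$rc"
}
```

Calling `check_late false` reports success even though `false` failed, while `check_immediately false` returns the real status.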
Why Do Traps Matter More in Production Scripts Than in Demos?
A trap catches signals and script exits so you can clean up before the process disappears. In a demo, there's nothing to clean up. In a production maintenance script, there might be a lock file, a temp directory, a partial export, or an open database session that needs to be closed properly.
Without the trap, a script killed mid-flight leaves the lock file in place, and the next scheduled run sees the lock, assumes the previous job is still running, and skips — potentially indefinitely. That's a real production failure mode, not a theoretical one.
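One way to sketch the cleanup pattern is a wrapper that creates the lock, registers the trap, and only then runs the job. The name `run_maintenance`, the exit code 75 for "lock held", and the temp directory are assumptions for illustration.

```shell
#!/usr/bin/env bash
# run_maintenance <lock_file> <command> [args...]  (hypothetical wrapper)
# The body runs in a subshell, so the traps and the EXIT cleanup are scoped to this job.
run_maintenance() (
    lock_file="$1"
    shift
    # noclobber makes the redirect fail if the lock already exists (atomic create).
    if ! ( set -o noclobber; echo "$$" > "$lock_file" ) 2>/dev/null; then
        echo "lock $lock_file exists; another run may be active" >&2
        exit 75
    fi
    tmp_dir="$(mktemp -d)"
    cleanup() {
        rm -rf -- "$tmp_dir"
        rm -f -- "$lock_file"
    }
    trap cleanup EXIT          # runs on normal exit and on `exit` from the signal traps
    trap 'exit 130' INT
    trap 'exit 143' TERM
    "$@"
    rc=$?
    exit "$rc"
)
```

The trap is registered only after the lock has been created, so a run that finds someone else's lock never deletes it. A trap cannot help with `kill -9`, which is one reason to also record the PID in the lock file and check whether that process is still alive.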
How Do You Stop a Script from Pretending Things Worked?
The most dangerous scripts are the ones that chain commands with semicolons or pipes without checking intermediate results. A multi-step maintenance job (export, transform, load) can fail at step two and still report success if step three exits cleanly, because the script's exit status is that of its last command. Chaining with `&&` does stop at the first failure, but on its own it logs nothing about which step failed. The strong answer uses explicit exit-code checks at each step, exits with a non-zero code on failure, and never logs "completed successfully" unless every step actually completed successfully. Interviewers probe this with: "What does your monitoring system see if the export step fails but the load step doesn't run?" The answer should be: an alert, not a green checkbox.
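A small sketch of that discipline: each step is a named function, the runner stops at the first failure, and the success message is printed only after the last step returns zero. `run_job` and the step names in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# run_job <step_function>...: run steps in order and stop at the first failure.
run_job() {
    local step rc
    for step in "$@"; do
        "$step"
        rc=$?
        if [ "$rc" -ne 0 ]; then
            echo "job FAILED at step '$step' (exit $rc); later steps skipped" >&2
            return "$rc"
        fi
    done
    # Reached only when every step returned zero.
    echo "job completed successfully"
}
```

With steps defined as functions, the call would look like `run_job step_export step_transform step_load || exit $?`, so the non-zero status reaches cron or the monitoring system unchanged.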
Show That You Can Parse Logs and Command Output When Something Breaks
How Would You Find an Outage Signal Hiding in a Noisy Log?
Database service logs are verbose. The useful signal — connection failures, lock timeouts, replication errors — is buried in lines of routine activity. The practical approach is `grep` for the known error pattern, then `awk` or `cut` to extract the timestamp and relevant field, then a count threshold to distinguish a real spike from a transient blip.
The follow-up question is usually: "What if the error message format changes between versions?" The honest answer is that text parsing is inherently brittle, which is why structured logging — JSON log output from the database, parsed with `jq` — is preferable when the database supports it. But knowing how to grep effectively is still a required skill for the environments where structured output isn't available.
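The grep, extract, threshold approach can be sketched as below. It assumes each log line starts with a date and a time as its first two whitespace-separated fields, which depends on how the database's log prefix is configured. The function name `error_spike` is invented for the example.

```shell
#!/usr/bin/env bash
# error_spike <log_file> <pattern> <threshold>  (hypothetical helper)
# Prints the match count and the timestamp of the most recent match.
# Returns 0 only when the count reaches the threshold.
error_spike() {
    local log="$1" pattern="$2" threshold="$3" count last_seen
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
    # grep -c prints 0 and exits 1 when nothing matches, hence the `|| true`.
    count="$(grep -c -- "$pattern" "$log" || true)"
    # Assumption: fields 1 and 2 of each line are the date and the time.
    last_seen="$(grep -- "$pattern" "$log" | tail -n 1 | awk '{print $1, $2}' || true)"
    echo "count=$count last_seen=${last_seen:-none}"
    [ "$count" -ge "$threshold" ]
}
```

A caller might alert only when `error_spike "$log" 'too many connections' 5` returns zero, so one or two transient errors do not page anyone.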
How Do You Detect Replication Lag or Space Pressure from Shell Output?
Replication lag and disk space are two of the most common operational checks DBAs automate. For disk space, `df -h` is readable but fragile to parse; on systems with GNU coreutils, `df --output=pcent,target` gives cleaner field selection. For replication lag, the check depends on the database: `pg_stat_replication` via `psql`, `SHOW REPLICA STATUS` via `mysql` (`SHOW SLAVE STATUS` on versions before MySQL 8.0.22), or a vendor-specific query. The shell script wraps the query, captures the output, and compares against a threshold.
The interviewer's probe is whether you avoid false positives — alerting when replication is intentionally paused, or when the replica has no load and the lag metric is null.
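Both checks reduce to "parse a number, compare to a threshold, and treat a missing value as its own case". The sketch below assumes input in the shape GNU `df --output=pcent,target` prints (a header row, then a percentage and a mount point without spaces); the function names are hypothetical.

```shell
#!/usr/bin/env bash
# over_threshold <limit>: reads "pcent target" rows on stdin and prints the
# mount points whose usage is at or above <limit> percent.
over_threshold() {
    local limit="$1"
    LC_ALL=C awk -v limit="$limit" '
        NR > 1 {                      # skip the header row
            pct = $1
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $2, pct "%"
        }'
}

# lag_status <lag_seconds> <limit>: classify one replication lag reading.
# An empty reading is reported as "unknown" instead of being treated as zero
# or as a breach, which avoids both kinds of false result.
lag_status() {
    local lag="$1" limit="$2"
    if [ -z "$lag" ]; then
        echo unknown
    elif awk -v l="$lag" -v m="$limit" 'BEGIN { exit !(l + 0 > m + 0) }'; then
        echo lagging
    else
        echo ok
    fi
}
```

Typical use would be `df --output=pcent,target | over_threshold 90`, with the lag value coming from whatever query the database in question exposes.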
What Makes Output Parsing Robust Instead of Fragile?
Three things break output parsing: unquoted variables, locale-dependent formatting, and parsing output that was designed for humans rather than machines. Always quote variables. Use `LC_ALL=C` when parsing numeric or date output that could vary by locale. And when a machine-readable option exists — `--porcelain` in git, `--output` flags in system tools, JSON output from database clients — use it instead of parsing the human-formatted version. The GNU awk documentation covers field separation reliably; the judgment about when to use it versus a structured alternative is what the interview is actually testing.
What Makes a Shell Script Safe for Production Database Use
Why Do Quoting and Variable Handling Matter So Much Here?
An unvalidated variable in a backup script is a ticking failure. If `$BACKUP_DIR` is empty (because an environment variable wasn't set, a config file wasn't sourced, or a name was mistyped), then `rm -rf $BACKUP_DIR/` becomes `rm -rf /`, and the closely related `rm -rf "$BACKUP_DIR"/*` becomes `rm -rf /*`, which GNU `rm`'s `--preserve-root` safeguard does not catch. That's not a hypothetical. Always quote variables, always validate that required variables are set before using them, and use `${VAR:?error message}` syntax for variables that must be non-empty.
A single line such as `: "${BACKUP_DIR:?BACKUP_DIR must be set}"` near the top of the script exits immediately with a clear error message if `BACKUP_DIR` is unset or empty. That's one line of defensive code that prevents a catastrophic path error.
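The same guard works on function arguments. The sketch below is a hypothetical retention helper: the name `purge_old_backups` and the `*.dump` naming convention are assumptions, and `-maxdepth` and `-delete` are `find` extensions available in GNU and BSD versions.

```shell
#!/usr/bin/env bash
# purge_old_backups <dir> <days>: delete *.dump files older than <days> days.
purge_old_backups() {
    # ${N:?msg} aborts a non-interactive shell with the message if the value is
    # unset or empty, so nothing below can run against an empty path.
    local dir="${1:?backup directory is required}"
    local days="${2:?retention in days is required}"
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    # find is limited to one directory and one file pattern; no rm -rf on a variable.
    find "$dir" -maxdepth 1 -type f -name '*.dump' -mtime +"$days" -delete
}
```

Scoping the delete to a pattern inside a validated directory means that even a wrong path removes far less than a recursive `rm` would.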
How Should a Script Handle Passwords, Environment Variables, and Secrets?
Environment variables are an acceptable pattern for credentials in controlled environments: many database clients accept passwords that way, and it avoids hardcoding secrets in script files. The risk is that environment variables can be visible in process listings (`ps e`) and are inherited by every child process the script starts. A safer DBA answer uses a credentials file with restricted permissions (`chmod 600`) and sources it at the start of the script, or uses a secrets manager if the infrastructure supports one.
What's never acceptable in an interview answer: hardcoded passwords in the script body, credentials in the script's filename or log output, or passing passwords as command-line arguments where they appear in `ps aux`. Cron-scheduled scripts that handle database credentials need the same security discipline as application code.
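A credentials file is only safer if the script refuses to use one that other users can read. The sketch below assumes GNU `stat` (the `-c '%a'` form; BSD and macOS use a different flag) and a hypothetical helper name, `load_credentials`.

```shell
#!/usr/bin/env bash
# load_credentials <file>: source a credentials file only if it is owner-only.
load_credentials() {
    local file="$1" mode
    [ -f "$file" ] || { echo "credentials file not found: $file" >&2; return 1; }
    mode="$(stat -c '%a' "$file")"    # assumption: GNU stat is available
    if [ "$mode" != "600" ] && [ "$mode" != "400" ]; then
        echo "refusing to read $file: mode $mode is too permissive" >&2
        return 1
    fi
    # The file is expected to contain plain VAR=value assignments.
    . "$file"
}
```

The error message names the file and its mode but never echoes the contents, so nothing sensitive ends up in the cron log.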
What Does Idempotent Actually Mean in a DBA Script?
Idempotent means you can run the script twice and the second run either produces the same result or exits safely without causing damage. For a deployment script, that means checking whether the schema change has already been applied before applying it. For a maintenance script, that means checking for a lock file before starting, and cleaning it up on exit. For a backup script, that means using a timestamp in the filename so a rerun creates a new file rather than overwriting the previous one.
Interviewers care about idempotency because production incidents often require rerunning scripts after a partial failure. A script that can't be safely rerun is a liability at exactly the moment you need it most.
The Shell Scripting Questions DBAs Actually Get Asked
How Would You Automate a Database Backup and Alert on Failure?
The complete answer covers the command wrapper, the exit-status check using `${PIPESTATUS[0]}` for piped commands, a timestamped log entry on both success and failure, and an alert path — typically `mail` or a webhook — that fires only on failure. The script should validate its inputs before running, use a lock file to prevent overlap, and clean up the lock file on exit via a trap. The interviewer wants to hear about the failure path before you describe the success path.
How Do You Check Whether a Database Service Is Up from Bash?
Three levels of depth here, and the interviewer is listening for which one you reach. First level: check the process with `ps aux | grep postgres`. Second level: check the port with `nc -z localhost 5432`. Third level: actually connect and run a trivial query — `psql -U monitor -c "SELECT 1"` — and check the exit code. Only the third level tells you the service is actually responding to queries. The first level can return true while the database is completely hung. Strong candidates explain all three and say which one they'd use in a production health check and why.
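The third level, plus a retry so a single transient failure does not page anyone, could be sketched like this. The `monitor` role comes from the example above; the `postgres` database name, the five-second timeout, and both function names are assumptions.

```shell
#!/usr/bin/env bash
# db_is_up [host] [port]: succeeds only if the server answers a trivial query.
db_is_up() {
    local host="${1:-localhost}" port="${2:-5432}"
    # PGCONNECT_TIMEOUT keeps the check from hanging on an unreachable host.
    PGCONNECT_TIMEOUT=5 psql -h "$host" -p "$port" -U monitor -d postgres \
        -Atqc 'SELECT 1' 2>/dev/null | grep -qx 1
}

# retry <attempts> <delay_seconds> <command> [args...]
# Succeeds as soon as one attempt succeeds; fails only if all attempts fail.
retry() {
    local attempts="$1" delay="$2" i
    shift 2
    for ((i = 1; i <= attempts; i++)); do
        "$@" && return 0
        [ "$i" -lt "$attempts" ] && sleep "$delay"
    done
    return 1
}
```

A health check would then read `retry 3 10 db_is_up || alert`, where `alert` is a placeholder for the site's notification command. Checking that the query returns exactly `1` guards against a client that exits zero without producing a result.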
How Would You Parse a Log File to Find the Last Outage or Error Spike?
The defensible approach: `grep` for the known error pattern, pipe to `tail` for recency, and use `awk` to extract the timestamp for the alert message. If you need to count occurrences in a time window, `awk` with a timestamp comparison is cleaner than a multi-stage pipeline. The strong answer also mentions what you'd do when the log rotates mid-check — either use `logrotate`-aware tooling or check both the current and the rotated file.
How Do You Write a Script That Calls sqlplus, psql, or mysql Safely?
The key points: pass credentials via environment variables or a credentials file, never as command-line arguments. Capture the exit code explicitly — database clients don't always exit non-zero on SQL errors, so you may need to check the output for error strings as well. Use a here-doc or `-f` flag to pass SQL rather than inline strings that require escaping. And set a connection timeout so the script doesn't hang indefinitely waiting for a host that's down.
The interviewer's follow-up is usually about what happens if the SQL file contains an error partway through — does the script detect it, and does it roll back?
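For `psql`, one reasonable answer to that follow-up combines `ON_ERROR_STOP` with `--single-transaction`: the first failing statement aborts the script with a non-zero exit code and the whole file is rolled back. The wrapper name `run_sql_file` and the ten-second timeout are assumptions; credentials are expected to come from the environment or a password file, not from the command line.

```shell
#!/usr/bin/env bash
# run_sql_file <database> <sql_file>  (hypothetical wrapper)
run_sql_file() {
    local db="$1" file="$2"
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # -X                    ignore any personal psqlrc so behaviour is predictable
    # ON_ERROR_STOP=1       stop at the first SQL error and exit non-zero
    # --single-transaction  wrap the file in one transaction, rolled back on error
    PGCONNECT_TIMEOUT=10 psql -X -d "$db" \
        -v ON_ERROR_STOP=1 --single-transaction -f "$file"
}
```

Other clients need their own equivalent, so the strong answer names the mechanism for the client actually in use instead of assuming a non-zero exit on SQL errors.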
How Do You Schedule Routine Maintenance with Cron Without Causing Overlap?
Use a lock file. Check for it at the start of the script, exit if it exists and the owning process is still running, create it if not, and clean it up on exit via a trap. The `flock` command provides a cleaner mechanism on Linux systems that support it. The cron job itself should redirect both stdout and stderr to a log file — not to `/dev/null` — so you have a record of what happened. The interviewer is listening for whether you mention overlap as a real risk, because a maintenance job that runs for longer than its scheduled interval will eventually collide with itself and cause exactly the kind of database contention it was meant to prevent.
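With `flock` from util-linux, the kernel releases the lock when the process exits, even after a crash, so there is no stale lock file to clean up. The wrapper name `run_exclusive`, file descriptor 9, and exit code 75 are arbitrary choices for this sketch.

```shell
#!/usr/bin/env bash
# run_exclusive <lock_file> <command> [args...]  (hypothetical wrapper)
# Runs the command only if no other process currently holds the lock.
run_exclusive() {
    local lock_file="$1"
    shift
    (
        # -n: fail immediately instead of waiting for the other run to finish.
        flock -n 9 || {
            echo "previous run still holds $lock_file; skipping" >&2
            exit 75
        }
        "$@"
    ) 9>"$lock_file"
}
```

The matching crontab entry would send both output streams to a log, along the lines of `*/15 * * * * /path/to/maintenance.sh >> /var/log/db_maintenance.log 2>&1`, where both paths are placeholders.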
As one lead DBA at a major cloud services company described it: "The first thing I ask about is failure. If a candidate's answer to 'how would you schedule a backup?' starts with the cron line and ends with the dump command, they haven't thought about the job yet."
How Hiring Managers Should Score Practical Shell Scripting Skill
What Separates a Usable Answer from a Memorized One?
A memorized answer describes the script. A usable answer describes the job the script is doing and the failure modes it has to handle. The scoring lens is simple: does the candidate talk about what happens when the database command fails, or do they only describe the happy path? Do they mention logging, alerting, and exit codes without being prompted? Do they ask clarifying questions — "what database client, what OS, what alerting system?" — before diving into syntax? Those behaviors signal operational experience, not just Bash knowledge.
What Follow-Up Questions Expose Real Experience Fast?
The probes that separate depth from surface knowledge:
- "What happens if the backup directory doesn't exist when the script runs?"
- "How do you know the backup file isn't corrupt?"
- "What does the cron daemon do with the output if you don't redirect it?"
- "How do you prevent two instances of this script from running at the same time?"
- "Where do the database credentials live, and who can read that file?"
None of these require exotic Bash knowledge. They all require having actually run a backup job in production and dealt with what goes wrong.
What Should a Strong Practical Script Reveal About the Candidate?
The best scripts are boring in exactly the right way: readable variable names, explicit exit-code checks, a trap for cleanup, logging on both success and failure, and no clever one-liners that require a comment to explain. A hiring manager looking at a candidate's backup script should be able to understand it in two minutes and trust it in production. Syntax fireworks — complex parameter expansions, nested subshells, obscure `awk` idioms — are a yellow flag, not a green one. They signal that the candidate optimized for impressiveness rather than maintainability, which is the wrong optimization for a DBA automation script that someone else will have to debug at 3 a.m.
A short rubric: does the script handle failure gracefully (exit codes, traps, logging)? Is it safe to rerun after a partial failure (idempotency, lock files)? Would a colleague understand it without asking questions? If yes to all three, the candidate has demonstrated the operational judgment that actually matters.
How Verve AI Can Help You Prepare for Your Interview With Shell Scripting
The gap that trips most DBA candidates isn't knowledge — it's the live follow-up. You know how to write a backup wrapper. You know about exit codes. But when an interviewer asks "what happens if the dump fails halfway through a pipe?" in real time, the answer needs to come out clearly, specifically, and without the hedging that signals you're reconstructing it on the fly rather than recalling it from experience.
That's the problem Verve AI Interview Copilot is built to close. It listens in real-time to the live interview conversation and surfaces the specific, context-aware answer — not a generic definition, but the operational detail that fits what the interviewer actually asked. For shell scripting scenarios like backup automation, exit-code propagation, and cron overlap prevention, Verve AI Interview Copilot responds to what's being asked right now, not a canned prompt you preloaded. It stays invisible while it works, so the conversation stays natural. If you want to practice the follow-up sequences before the real interview — the "what if it fails here?" probes that expose whether your knowledge is real or rehearsed — Verve AI Interview Copilot runs mock interviews against exactly those scenarios. That's the preparation that turns a memorized answer into a confident one.
Conclusion
Shell scripting for DBAs is operational work, not syntax trivia. The questions in this guide are all variations on the same underlying test: do you understand the job well enough to automate it safely, handle its failure modes, and leave something behind that a colleague can maintain and trust?
Before your interview, practice three things specifically. Write a backup wrapper that checks the exit code, logs the result, and sends an alert on failure — then break it deliberately and see if it catches the failure. Write a log-parsing check that finds a specific error pattern in a database log and avoids false positives. Write a cron-scheduled maintenance script with a lock file and a trap. Run each one, break each one, and fix each one. That's the experience the interviewer is trying to find evidence of — and there's no shortcut to it that doesn't involve actually doing it.
Jordan Ellis
Interview Guidance

