Stop Building a Portal. Build a Paved Road.
Internal developer portals are only valuable when they unlock self-service on the paved road. Here’s the playbook we use to ship it in two weeks—without inventing bespoke tooling you’ll regret.
“Your portal isn’t the product. The paved road is.”
The Portal That Never Shipped
I’ve watched a unicorn fintech spend nine months and seven figures building a “single pane of glass” portal. It had a gorgeous React shell, bespoke plugins, and a product manager who could demo the heck out of it. One problem: engineers still filed tickets to provision a database or request a secret. No paved road. No self-service. The portal was a pretty wiki.
We replaced it with something boring: Backstage with near-vanilla config, a handful of software templates, GitHub Actions with OIDC to cloud, Terraform modules for infra, and ArgoCD app-of-apps. Two weeks. By week three, 60% of new services came through the template, change lead time dropped 43%, and platform ticket volume fell by half. Did it have every nice-to-have? No. Did it ship? Absolutely.
The Actual Job: Self-Service on the Paved Road
Senior leaders don’t buy portals; they buy reduced lead time and fewer outages. A useful IDP lets a developer:
- Create a new service that compiles, builds, deploys, emits metrics, and has on-call ownership in under 30 minutes.
- Provision a database or queue with sane defaults, cost tags, and backups—without a ticket.
- Expose a service externally with TLS and auth, with guardrails baked in.
- Rotate secrets without waking up a platform engineer.
- Find owners, SLOs, runbooks, and deployment health in one click.
Measure success with DORA metrics and SRE basics:
- Lead time for changes (PR opened -> prod deployed)
- Change failure rate (failed deployments/total)
- MTTR (issue -> mitigation)
- Ticket volume related to platform/provisioning
Portals are just the UI for the paved road. If the underlying defaults and automation are weak, the portal is lipstick on a pig.
Defaults Over Bespoke: The Boring Stack That Wins
I’ve seen bespoke everything. It rots. Instead, lock in paved-road defaults:
- Languages: pick 2-3 (e.g., `go`, `node`, `java`). Ship templates with `CODEOWNERS`, `renovate.json`, `Dockerfile`, a Helm chart, and `catalog-info.yaml`.
- CI/CD: GitHub Actions (or GitLab CI) with reusable workflows for build, test, SBOM, SAST, deploy.
- Infra: Terraform modules for common resources (`service`, `database`, `queue`, `bucket`), with tags, budgets, encryption, and backups.
- Deploy: ArgoCD + ApplicationSets (GitOps) with environment overlays.
- Observability: OpenTelemetry default exporters + Prometheus/Grafana dashboards scaffolded.
- Identity & RBAC: OIDC from GitHub Actions to cloud; RBAC by repo team -> namespace mapping.
A “golden path” should be boring to the point of mockery. That’s the point: fewer knobs, fewer footguns.
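The reusable CI workflows deserve a concrete shape, because they’re what keeps a hundred pipelines upgradable from one place. A sketch (the `acme/platform-workflows` repo and input names are assumptions):

```yaml
# .github/workflows/build-test.yml in a shared acme/platform-workflows repo (hypothetical)
name: build-test
on:
  workflow_call:
    inputs:
      go-version:
        type: string
        required: false
        default: '1.22'
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with: { go-version: "${{ inputs.go-version }}" }
      - run: go vet ./...
      - run: go test ./...
```

Service repos then call it with a single `uses: acme/platform-workflows/.github/workflows/build-test.yml@v1` line, so a security or tooling change lands everywhere with one tagged release.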
Backstage vs SaaS (Port/Cortex/OpsLevel): The Trade-offs
You can ship an IDP three ways:
- Backstage (Spotify OSS): Maximum control, massive plugin ecosystem, lower license cost, higher ownership cost. You own upgrades, plugins, and auth. Good fit if you can staff 1-2 ongoing FTEs.
- SaaS portals (Port, Cortex, OpsLevel): Faster time-to-value, great scorecards, integrations out-of-the-box, vendor cost instead of headcount. Less low-level control, but you get velocity and SLAs.
What’s worked in practice:
- If you have <8 platform engineers and no appetite for plugin maintenance, start with SaaS. You will ship faster and avoid yak shaving.
- If you already run Backstage via Roadie/Spotify templates or you need heavy customization (e.g., internal risk workflows), Backstage is fine—just keep it vanilla.
Cost reality I’ve seen:
- Backstage: 2 FTEs to keep plugins/auth/templates current (~$400k/yr fully loaded) + infra. License: $0.
- SaaS: $80k–$250k/yr depending on seats/features. Minimal ongoing engineering (<0.25 FTE).
Pick your poison consciously. The business case is lead time and fewer tickets, not feature parity with Spotify.
A Minimal Self-Service Menu (With Real Configs)
Start with three buttons. Seriously.
- New Service (scaffold -> build -> deploy -> observability)
- Backstage software template (`template.yaml`):
```yaml
# templates/service/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: go-service
  title: Go Service (Paved Road)
  tags: [go, http, kubernetes]
spec:
  owner: platform
  type: service
  parameters:
    - title: Service Info
      required: [name, owner]
      properties:
        name:
          type: string
          description: Repo/service name
        owner:
          type: string
          description: Backstage group (e.g., team-payments)
  steps:
    - id: fetch
      name: Fetch Template
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: "${{ parameters.name }}"
          owner: "${{ parameters.owner }}"
    - id: publish
      name: Create Repository
      action: publish:github
      input:
        repoUrl: github.com?owner=acme&repo=${{ parameters.name }}
        description: Paved-road Go service
        defaultBranch: main
        protectDefaultBranch: true
    - id: register
      name: Register in Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
        catalogInfoPath: "/catalog-info.yaml"
```

`catalog-info.yaml` in the template skeleton:
```yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: ${{ values.name }}
  annotations:
    github.com/project-slug: acme/${{ values.name }}
    backstage.io/techdocs-ref: dir:.
    argocd/app-name: ${{ values.name }}
spec:
  type: service
  lifecycle: production
  owner: ${{ values.owner }}
```

- GitHub Actions deploy with OIDC to the cloud and GitOps handoff:
```yaml
# .github/workflows/deploy.yml
name: deploy
on:
  push:
    branches: [main]
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with: { go-version: '1.22' }
      - run: go test ./...
      - run: go build -o app
      - uses: aquasecurity/trivy-action@0.20.0
        with: { scan-type: fs, ignore-unfixed: true }
  release:
    needs: build-test
    runs-on: ubuntu-latest
    permissions:
      id-token: write # for OIDC
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Login to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build & Push
        run: |
          IMAGE=ghcr.io/${{ github.repository }}:${{ github.sha }}
          docker build -t "$IMAGE" .
          docker push "$IMAGE"
          echo "IMAGE=$IMAGE" >> "$GITHUB_ENV"  # expose IMAGE to later steps
      - name: Update ArgoCD manifests
        uses: imranismail/setup-kustomize@v2
      - run: |
          git clone https://github.com/acme/env-prod.git  # needs a token with push access in real use
          cd env-prod/apps/${{ github.event.repository.name }}
          kustomize edit set image "service=$IMAGE"
          git config user.email ci@acme.com
          git config user.name ci-bot
          git commit -am "bump image $IMAGE"
          git push
```

- Provision a Database (Terraform module + policy checks)
- Terraform module usage (Cloud SQL/Aurora—pattern is similar):
```hcl
module "db_orders" {
  source         = "git::https://github.com/acme/tf-modules.git//postgres?ref=v1.6.0"
  name           = var.service_name
  env            = var.env
  storage_gb     = 50
  high_avail     = true
  retention_days = 7
  tags = {
    cost-center = var.cost_center
    owner       = var.owner
  }
}
```

- Policy gate with Conftest/OPA in CI:
```shell
conftest test terraform/ --policy policy/ --fail-on-warn
```

- Expose a Service (ArgoCD ApplicationSet with sane defaults)
```yaml
# env-prod/apps/applicationset.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: services
spec:
  generators:
    - git:
        repoURL: https://github.com/acme/env-prod.git
        revision: main
        directories:
          - path: apps/*
  template:
    metadata:
      name: '{{path.basename}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/acme/env-prod.git
        targetRevision: main
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'
      syncPolicy:
        automated: { prune: true, selfHeal: true }
        syncOptions: [CreateNamespace=true]
```

Start here. Ship it. Then add the fourth button (queue) and fifth (secret).
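The Conftest gate above assumes a `policy/` directory exists. A minimal Rego policy, written against Terraform plan JSON (generate it with `terraform show -json tf.plan`; the resource shape and tag name are illustrative):

```rego
# policy/tags.rego -- illustrative; tighten to your modules' outputs
package main

# Fail any planned resource that supports tags but lacks a cost-center tag.
deny[msg] {
  rc := input.resource_changes[_]
  rc.change.after.tags
  not rc.change.after.tags["cost-center"]
  msg := sprintf("%s is missing the cost-center tag", [rc.address])
}
```

One policy like this, enforced in CI, replaces an entire wiki page of tagging guidance.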
Wiring It Together Fast: A Two-Week Plan
I’ve done this a dozen times. The timeline that actually works:
- Day 1-2: Pick the paved-road defaults. Lock languages, CI, deploy, infra modules. Stand up identity: GitHub OIDC -> cloud provider; map repo teams to namespaces.
- Day 3-5: Backstage (or SaaS) base. For Backstage, start utterly simple:
```yaml
# app-config.yaml (snippets)
app:
  baseUrl: https://idp.acme.com
backend:
  baseUrl: https://idp.acme.com
  auth:
    keys:
      - secret: ${BACKEND_SECRET}
auth:
  environment: production
  providers:
    github:
      production:
        clientId: ${GITHUB_CLIENT_ID}
        clientSecret: ${GITHUB_CLIENT_SECRET}
integrations:
  github:
    - host: github.com
      token: ${GITHUB_TOKEN}
catalog:
  locations:
    - type: github-discovery
      target: https://github.com/acme/*/blob/main/catalog-info.yaml
techdocs:
  builder: 'local'
  generator:
    runIn: 'local'
  publisher:
    type: 'local'
```

- Day 6-7: Implement the “New Service” template and repo skeletons. Bake in TechDocs (`mkdocs.yml`), `CODEOWNERS`, Renovate, and default dashboards.
- Day 8-9: Terraform modules and a `provision-infra` workflow:
```yaml
# .github/workflows/provision-infra.yml
name: provision-infra
on: [workflow_dispatch]
jobs:
  tf-apply:
    runs-on: ubuntu-latest
    permissions: { id-token: write, contents: read }
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Auth to Cloud via OIDC
        run: ./scripts/oidc_login.sh
      - run: terraform init
      - run: terraform plan -out tf.plan
      - run: conftest test terraform/ --policy policy/
      - run: terraform apply -auto-approve tf.plan
```

- Day 10: ArgoCD app-of-apps and ApplicationSet; deploy a demo service end-to-end.
- Day 11-12: Add “Provision DB” template and policy checks.
- Day 13-14: Scorecards + SLOs + docs. Pilot with two teams, fix sharp edges.
Do less, better. Resist plugin rabbit holes.
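The `provision-infra` workflow calls a `./scripts/oidc_login.sh` helper. A sketch of an AWS flavor (the role ARN is a placeholder; GCP/Azure equivalents swap the STS exchange):

```shell
#!/usr/bin/env bash
# scripts/oidc_login.sh -- illustrative AWS flavor; role ARN is an assumption.
# Requires `id-token: write` so GitHub injects the ACTIONS_ID_TOKEN_* vars.
set -euo pipefail

ROLE_ARN="arn:aws:iam::123456789012:role/gha-provision-infra"  # placeholder

# Ask GitHub for a short-lived OIDC token, audience-scoped to AWS STS.
JWT=$(curl -sSf -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" \
  "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=sts.amazonaws.com" | jq -r '.value')

# Exchange it for temporary credentials -- no static cloud keys anywhere.
CREDS=$(aws sts assume-role-with-web-identity \
  --role-arn "$ROLE_ARN" \
  --role-session-name "gha-${GITHUB_RUN_ID:-local}" \
  --web-identity-token "$JWT" \
  --duration-seconds 900 \
  --query Credentials --output json)

# Write to GITHUB_ENV so later workflow steps see the credentials.
{
  echo "AWS_ACCESS_KEY_ID=$(jq -r '.AccessKeyId' <<<"$CREDS")"
  echo "AWS_SECRET_ACCESS_KEY=$(jq -r '.SecretAccessKey' <<<"$CREDS")"
  echo "AWS_SESSION_TOKEN=$(jq -r '.SessionToken' <<<"$CREDS")"
} >> "$GITHUB_ENV"
```

Inside Actions, `aws-actions/configure-aws-credentials` does this for you; a script is mainly useful if you want the workflow itself to stay cloud-agnostic.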
Governance Without Becoming the Platform Police
Scorecards and SLOs beat nagging and tickets. Whether you use Backstage plugins (Roadie’s Scorecards, Spotify’s Checks) or SaaS (OpsLevel, Cortex), make compliance visible and fixable.
- Example scorecard (OpsLevel style):
```yaml
# opslevel.yml
service:
  name: orders
  owner: team-payments
checks:
  - name: Has On-Call
    level: bronze
    type: on_call
    params: { provider: pagerduty }
  - name: SLO Defined
    level: silver
    type: slo
    params: { objective: 99.9, window_days: 30 }
  - name: Security Scans in CI
    level: silver
    type: ci_job
    params: { job: trivy }
  - name: Prod Alerts < 5/wk
    level: gold
    type: alert_frequency
    params: { threshold: 5 }
```

- Service SLO scaffold (kept in repo so it’s code-reviewed):
```yaml
# slo.yaml
service: orders
objective: 99.9
window: 30d
indicator:
  type: latency
  percentile: p95
  threshold_ms: 400
alerts:
  burn_rate:
    fast: 14x over 2h
    slow: 6x over 24h
```

Enforce the hard lines (security, cost) with guardrails:
- OIDC + fine-grained IAM; no long-lived cloud keys in repos.
- Terraform policies (OPA/Conftest or Sentinel) for encryption, tags, size limits.
- Namespace quotas and network policies as defaults.
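Those last two guardrails are just manifests your namespace provisioning stamps out. A sketch (quota numbers and the namespace are illustrative):

```yaml
# Default guardrails applied to every new namespace; limits are illustrative
apiVersion: v1
kind: ResourceQuota
metadata:
  name: default-quota
  namespace: orders
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    pods: "40"
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: orders
spec:
  podSelector: {}
  policyTypes: [Ingress]
```

Teams that need more open a PR against the namespace config, which is exactly the review point you want.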
The key: put the requirements where engineers work (the portal and the repo), not in a Confluence page no one reads.
Results, Trade-offs, and Monday’s Plan
What we’ve actually seen after shipping the minimal portal + paved road:
- Lead time: 2-3 days -> under 24 hours for 70% of changes.
- Change failure rate: down 20-30% due to consistent pipelines and rollbacks.
- MTTR: down 30% with standard dashboards and on-call metadata in one place.
- Platform tickets: -40% to -60% (DB, service create, deploy exposures self-served).
- New service spin-up: from 1-2 weeks -> 30-60 minutes.
Trade-offs you must accept:
- You won’t support every language/runtime on day one.
- Some senior devs will want to tweak everything; keep a fast path and an escape hatch.
- Backstage plugin sprawl is real; freeze versions and add plugins deliberately.
What to do Monday:
- Write down your 5-7 high-frequency tasks. Rank by impact.
- Choose your default stack (CI, CD, IaC, observability) and make hard decisions.
- Stand up identity (OIDC) and a simple catalog (Backstage or SaaS discovery).
- Ship “New Service” + “Provision DB” + “Expose Service”.
- Add scorecards and SLOs. Instrument DORA metrics.
If you do just that in two weeks, your “portal” will already be paying for itself. The rest is iteration. And yes—GitPlumbers can help if you want someone who’s broken this stuff before and knows where the bodies are buried.
Key takeaways
- Your portal is not the product; the paved road is.
- Ship self-service for 5-7 high-frequency tasks first. Everything else can wait.
- Prefer opinionated defaults (templates, modules, workflows) over bespoke automation.
- Choose Backstage when you can staff ongoing plugin upkeep; otherwise use SaaS.
- Govern with scorecards and guardrails, not tickets and tribal knowledge.
Implementation checklist
- Define the 5-7 self-service actions engineers need weekly.
- Pick paved-road defaults: language templates, CI workflows, Terraform modules, deploy strategy.
- Wire identity and permissions early (OIDC + repo org + environment RBAC).
- Start with one delivery stack (e.g., GitHub Actions + ArgoCD + Terraform).
- Implement scorecards and SLOs before you scale templates.
- Measure before/after: lead time, change failure rate, MTTR, ticket volume.
Questions we hear from teams
- How do I avoid Backstage plugin hell?
- Freeze to a minimal plugin set, upgrade quarterly, and treat plugins like dependencies with owners and SLAs. Default to official/maintained plugins. If a plugin isn’t proving value in 30 days, remove it.
- What if teams refuse the paved road?
- Offer an escape hatch with documented support boundaries. Make the paved road faster and safer (fewer approvals, fewer outages). Most teams will self-select into less friction.
- Can I do this without Kubernetes?
- Yes. Swap ArgoCD for ECS CodeDeploy or Cloud Run + Terraform. The paved-road pattern—templates, workflows, modules—still applies.
- What about secrets and auth?
- Use cloud-native managers (AWS Secrets Manager, GCP Secret Manager) and inject at runtime. Authenticate CI with OIDC and short-lived tokens—no static keys. Bake secret rotation actions into the portal.
- Where do docs live?
- In the repo. TechDocs/MkDocs for service docs, linked from the catalog. No separate wiki for operational docs—keep it near the code and the people who own it.
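A minimal `mkdocs.yml` is all a service repo needs for TechDocs to render it in the catalog (nav entries are illustrative):

```yaml
# mkdocs.yml -- minimal TechDocs setup in the service repo
site_name: orders
nav:
  - Overview: index.md
  - Runbook: runbook.md
plugins:
  - techdocs-core
```

Pair it with the `backstage.io/techdocs-ref: dir:.` annotation in `catalog-info.yaml` and docs ship with every merge.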
