jersey-mike / civilview-salesweb-scraper
Laravel 12 package that scrapes foreclosure/sheriff sale listings from salesweb.civilview.com.
Package info
github.com/jersey-mike/civilview-salesweb-scraper
Language:HTML
pkg:composer/jersey-mike/civilview-salesweb-scraper
Requires
- php: ^8.3
- illuminate/console: ^12.0
- illuminate/contracts: ^12.0
- illuminate/http: ^12.0
- illuminate/queue: ^12.0
- illuminate/support: ^12.0
- symfony/css-selector: ^7.0
- symfony/dom-crawler: ^7.0
Requires (Dev)
- orchestra/testbench: ^10.0
- pestphp/pest: ^3.0
- pestphp/pest-plugin-laravel: ^3.0
This package is auto-updated.
Last update: 2026-05-02 02:44:30 UTC
README
A Laravel 12 package that turns the public foreclosure / sheriff-sale listings on salesweb.civilview.com into clean PHP objects you can use, store, or pipe into your own systems.
Table of contents
- What this does
- Why you might want it
- How it works
- Installation
- Quick start
- Usage in depth
- The data model
- Cross-county schema variance
- The session quirk
- Artisan command
- Scheduling, queues, and events
- Configuration
- Testing
- Politeness and legal
- Troubleshooting
- License
What this does
salesweb.civilview.com is a Tyler
Technologies-hosted portal that ~50 U.S. counties use to publish their
upcoming foreclosure / sheriff-sale dockets — places like Philadelphia
County, PA; Burlington County, NJ; Maricopa County, AZ; Orleans Parish, LA;
and so on. Each county's page lists the open and recently-sold cases:
case number, sale date, plaintiff (typically a bank or mortgage servicer),
defendant, property address, and a "View Details" link to a richer page
showing the judgment amount, attorney of record, and an adjournment
history.
The site is public and read-only — there's no API, no login, and no robots restriction on the listings themselves. This package scrapes the HTML and returns plain readonly DTOs. It does not save anything to your database, mutate any state on the upstream site, or sign requests with credentials. It is a thin, well-tested adapter layer.
What you get back per scrape:
- A list of every county the portal serves, with their numeric IDs.
- Every sale on a county's docket — open or sold — with all the columns the county chose to expose, plus the property's detail-page URL and ID.
- For any property, the rich detail-page key/value pairs and full status history (every adjournment with its date and who requested it).
What you do with it is your problem (and the point): persist it to your schema, push it to a queue, diff it against last week's pull to detect changes, surface it in a dashboard, mail yourself when a specific plaintiff files, etc.
Why you might want it
A few real-world use cases:
- Real estate investors tracking new foreclosure inventory in target counties and getting alerted when properties matching certain criteria hit the docket.
- Title companies and attorneys monitoring filings in counties they practice in.
- Researchers and journalists studying foreclosure trends — which plaintiffs are most active, where, and how the volume changes month over month.
- Internal back-office tools at firms that already have to check this site daily — automating it means one less manual rotation.
If you've ever opened the Civilview site, ctrl-F'd through 1,500 rows looking for one plaintiff name, and thought "this should be an API," this package is that API.
How it works
The site is a server-rendered ASP.NET MVC app. There's no client-side rendering, so there's no JS engine required — a plain HTTP client and an HTML parser do the job.
             ┌────────────────────────────────────┐
             │  CivilviewClient (the public API)  │
             └──┬──────────┬──────────┬───────────┘
                │          │          │
                ▼          ▼          ▼
        ┌─────────────┐ ┌──────────┐ ┌──────────────┐
        │ HttpClient  │ │ Scrapers │ │ DTOs         │
        │ (Laravel    │ │ (DOM-    │ │ County       │
        │ Http +      │ │ Crawler) │ │ Sale         │
        │ Guzzle)     │ │          │ │ SaleDetail   │
        └─────────────┘ └──────────┘ └──────────────┘
                │          │          ▲
                ▼          ▼          │
        HTTPS GET/POST → salesweb.civilview.com
                │          │
                └──────────┘
              raw HTML response
Three high-level moving parts:
- HttpClient: A thin wrapper around Laravel's Http facade plus Guzzle's cookie jar. Handles the base URL, user agent, timeouts, and retries (configurable in config/civilview.php). Crucially, it can produce a session-aware request handle whose cookies survive across calls, which is needed because the site keeps state in ASP.NET sessions (see The session quirk).
- Scrapers: Three small classes, one per page type (CountyDirectoryScraper, SalesListScraper, SaleDetailScraper), each using Symfony\Component\DomCrawler with CSS selectors. They take raw HTML and produce DTOs. The scraping logic is intentionally defensive: column counts and labels vary between counties, and the scrapers cope.
- CivilviewClient: The orchestration layer. Picks the right HTTP request shape (GET vs. POST, cold vs. session-bound), routes the response to the right scraper, and returns a typed result. This is the class you actually call (usually via the Civilview facade).
A queueable job, an Artisan command, and two events sit on top for ergonomics, but the core is just those three pieces.
Installation
Requirements:
- PHP ^8.3
- Laravel ^12.0
Install via Composer:
composer require jersey-mike/civilview-salesweb-scraper
Laravel's package discovery picks up the service provider and the
Civilview facade automatically — no config/app.php edits.
If you want to override the defaults (base URL, user agent, timeouts, politeness delay), publish the config file:
php artisan vendor:publish --tag=civilview-config
That puts a config/civilview.php in your app where you can tweak things
or pull values from .env.
Quick start
    use JerseyMike\Civilview\Facades\Civilview;

    // Discover what counties exist
    $counties = Civilview::counties();
    $counties->where('state', 'NJ')->pluck('name');
    // Collection: ["Atlantic County", "Bergen County", "Burlington County", ...]

    // Pull every active foreclosure case in Burlington County, NJ
    $sales = Civilview::sales(countyId: 3);
    $sales->count();              // 89
    $sales->first()->plaintiff(); // "WELLS FARGO BANK, N.A."
    $sales->first()->address();   // "5 PEAR AVENUE BROWNS MILLS NJ 08015"
    $sales->first()->saleDate();  // "5/7/2026"
Three lines of code, roughly ninety property records.
Usage in depth
All examples assume use JerseyMike\Civilview\Facades\Civilview; at the top
of the file. If you'd rather inject the client (for testing or DI
preferences), type-hint JerseyMike\Civilview\CivilviewClient instead —
the facade is just sugar over a singleton binding.
Listing counties
The Civilview portal's homepage doubles as a county directory. We scrape it once and give you the list as DTOs:
    $counties = Civilview::counties();

    foreach ($counties as $county) {
        $county->id;    // 3
        $county->name;  // "Burlington County"
        $county->state; // "NJ"
    }
About 47 counties are returned at the time of writing, spanning AZ, CO,
DE, FL, IA, ID, IL, KS, LA, MN, NJ, OH, OR, PA, TX, and WA. Some Texas
counties have multiple sub-jurisdictions (Constable Precincts 1–5,
Sheriff's Office) — those come back as separate County rows with
distinct IDs and a descriptive name.
Listing sales for a county
Pass a countyId from the directory:
$sales = Civilview::sales(60); // Philadelphia County, PA → ~1,600 rows
The result is a Laravel Collection of Sale DTOs. Each has:
- countyId: echoed back so you can group/filter downstream
- propertyId: the upstream identifier (used to fetch the detail page)
- detailsUrl: the relative path on civilview.com
- attributes: an associative array [columnHeader => cellValue] preserving exactly what the table showed
Convenience accessors normalize the most common columns:
    $sale->plaintiff();     // "WELLS FARGO BANK, N.A."
    $sale->defendant();     // "KENNETH YAUGER; ..."
    $sale->address();       // "423 EAST 9TH STREET FLORENCE NJ 08518"
    $sale->saleDate();      // "4/30/2026"
    $sale->sheriffNumber(); // "25001681" (NJ counties)
    $sale->bookAndWrit();   // "2308-381" (Philadelphia)
    $sale->opaNumber();     // "1313111130" (Philadelphia)
These accessors are case-insensitive and return null when the column
isn't present for that county. For anything county-specific, fall back to
$sale->get('Some Custom Header') or $sale->attributes.
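
To make the "case-insensitive, null when missing" behaviour concrete, here is a minimal standalone sketch of that kind of lookup over a plain attributes array. The function name lookupAttribute is hypothetical and this is not the package's actual implementation; it only illustrates the contract described above.

```php
<?php

/**
 * Hypothetical helper (not part of the package): look up a column by its
 * header text, ignoring case and surrounding whitespace.
 * Returns null when the row has no such column.
 */
function lookupAttribute(array $attributes, string $header): ?string
{
    $wanted = strtolower(trim($header));

    foreach ($attributes as $key => $value) {
        if (strtolower(trim((string) $key)) === $wanted) {
            return $value === null ? null : (string) $value;
        }
    }

    return null;
}

// Sample rows using column headers quoted elsewhere in this README.
$burlington = ['Sheriff #' => '25001681', 'Plaintiff' => 'WELLS FARGO BANK, N.A.'];
$philadelphia = ['Book & Writ' => '2308-381', 'OPA #' => '1313111130'];

$found = lookupAttribute($burlington, 'PLAINTIFF');    // matches despite the case difference
$absent = lookupAttribute($philadelphia, 'Defendant'); // null: no such column in this row
```

Returning null instead of throwing keeps cross-county loops simple, at the cost of having to check for null wherever a column is required.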
Filtering
Build a SalesFilter and pass it as the second argument:
    use JerseyMike\Civilview\Data\SalesFilter;

    $wellsFargoOpen = Civilview::sales(3, new SalesFilter(
        isOpen: true,             // open vs. sold/cancelled
        plaintiff: 'WELLS FARGO', // partial match, the site does the filtering
        defendant: null,
        address: null,
        city: null,
        salesDate: '5/7/2026',    // a specific sale date, exact match
        monthNumber: null,        // or 1–12 for "all of May"
        sheriffNumber: null,
    ));
Mechanically: the upstream site stores its countyId in the session and
expects a POST to /Sales/SalesSearch with form fields like
PlaintiffTitle, IsOpen, PropertyStatusDate, etc. The package handles
both — the GET-then-POST dance and the field name mapping — so you stay in
PHP-land.
A real session output:
    php artisan civilview:scrape --county=3 --filter-status=open --filter-plaintiff="WELLS FARGO"
    # → 6 rows
Fetching a sale's detail page
    $detail = Civilview::saleDetails(propertyId: 1950595109, countyId: 3);

    $detail->propertyId;              // 1950595109
    $detail->get('Plaintiff');        // "NEW JERSEY HOUSING AND MORTGAGE FINANCE AGENCY"
    $detail->get('Approx. Judgment'); // "$185,432.10"
    $detail->get('Attorney');         // "STERN & EISENBERG, PC"
    $detail->statusHistory;
    // [
    //   ['status' => 'Scheduled for', 'date' => '10/23/2025'],
    //   ['status' => '1st Debtor Adjournment to', 'date' => '11/20/2025'],
    //   ['status' => '1st Attorney Adjournment to', 'date' => '1/8/2026'],
    //   ['status' => '2nd Attorney Adjournment to', 'date' => '2/5/2026'],
    //   ['status' => 'Adjourned by Court to', 'date' => '3/19/2026'],
    //   ['status' => 'Adjourned by Court to', 'date' => '5/7/2026'],
    // ]
Why the second argument matters — the Civilview detail endpoint
requires a session that has previously visited a SalesSearch for a
matching county. We seat that session for you when you pass countyId. If
you don't, you'll get an empty attributes array and a redirect to the
homepage. See The session quirk.
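
Once you have a statusHistory array in the shape shown above, summarizing it is plain array work. The sketch below is a hypothetical helper, not something the package provides. It assumes entries arrive in chronological order (as in the sample) and that adjournment rows contain the substring "adjourn", which holds for the sample but may not hold for every county.

```php
<?php

/**
 * Hypothetical helper (not part of the package): summarize a list of
 * ['status' => ..., 'date' => ...] entries.
 */
function summarizeHistory(array $history): array
{
    // Assumption: any status mentioning "adjourn" is an adjournment.
    $adjournments = array_filter(
        $history,
        static fn (array $entry): bool => stripos($entry['status'], 'adjourn') !== false,
    );

    // Assumption: the last entry carries the currently scheduled date.
    $last = $history === [] ? null : $history[array_key_last($history)];

    return [
        'adjournments' => count($adjournments),
        'current_date' => $last['date'] ?? null,
    ];
}

// The sample history from the example above.
$history = [
    ['status' => 'Scheduled for', 'date' => '10/23/2025'],
    ['status' => '1st Debtor Adjournment to', 'date' => '11/20/2025'],
    ['status' => '1st Attorney Adjournment to', 'date' => '1/8/2026'],
    ['status' => '2nd Attorney Adjournment to', 'date' => '2/5/2026'],
    ['status' => 'Adjourned by Court to', 'date' => '3/19/2026'],
    ['status' => 'Adjourned by Court to', 'date' => '5/7/2026'],
];

$summary = summarizeHistory($history);
// ['adjournments' => 5, 'current_date' => '5/7/2026']
```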
Listing + detail in one call
$detailed = Civilview::salesWithDetails(3); // → Collection<SaleDetail>
Internally:
- One scrape of the list page (1 HTTP request).
- One reused session for the whole batch (1 HTTP request to seat it).
- N detail-page GETs, one per sale, with a small usleep between each (configurable; default 250 ms) to avoid hammering the server.
For a 90-row Burlington pull, that's ~92 requests over ~25 seconds. For 1,600-row Philadelphia, plan for several minutes — push it to a queue.
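
The arithmetic behind those figures can be sketched as a small estimator. estimateBatch is a hypothetical name, not part of the package. It counts one pause per detail request, which slightly overstates a pause that only happens between requests, and it ignores network time, so treat the result as a rough floor.

```php
<?php

/**
 * Hypothetical estimator (not part of the package): request count and
 * total politeness delay for a listing-plus-details pull of $saleCount rows.
 */
function estimateBatch(int $saleCount, int $detailDelayMs = 250): array
{
    return [
        // 1 list-page request + 1 session-seating request + N detail GETs
        'requests' => 2 + $saleCount,
        // Sleep time only; actual wall-clock time adds network latency.
        'delay_seconds' => $saleCount * $detailDelayMs / 1000.0,
    ];
}

$burlington = estimateBatch(90);     // 92 requests, 22.5 s of delay
$philadelphia = estimateBatch(1600); // 1602 requests, 400 s of delay
```

At the default delay, the Philadelphia pull spends more than six minutes sleeping before any network time is counted, which is why it belongs on a queue.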
The data model
Three readonly DTOs, all with toArray() for JSON encoding:
County
    final readonly class County
    {
        public int $id;
        public string $name;
        public ?string $state; // null for jurisdictions outside the US heading parser
    }
Sale
    final readonly class Sale
    {
        public int $countyId;
        public int $propertyId;
        public string $detailsUrl; // e.g. "/Sales/SaleDetails?PropertyId=1950595109"
        public array $attributes;  // ["Plaintiff" => "...", "Address" => "...", ...]

        // Case-insensitive convenience accessors. Return null when missing.
        public function plaintiff(): ?string;
        public function defendant(): ?string;
        public function address(): ?string;
        public function saleDate(): ?string;
        public function sheriffNumber(): ?string;
        public function bookAndWrit(): ?string;
        public function opaNumber(): ?string;

        public function get(string $key): ?string; // for anything else
    }
SaleDetail
    final readonly class SaleDetail
    {
        public int $propertyId;
        public array $attributes;    // every key/value pair on the detail page
        public array $statusHistory; // [['status' => '...', 'date' => '...'], ...]

        public function get(string $key): ?string;
    }
The DTOs are deliberately dumb — no Eloquent, no traits, no ORM. They serialize cleanly to JSON, are safe to ship across queues, and can be mapped into whatever schema your application uses.
Cross-county schema variance
Different counties expose different columns. This is the single most important thing to understand if you're storing the data.
Burlington County, NJ (countyId=3):
| Sheriff # | Sales Date | Plaintiff | Defendant | Address |
|---|---|---|---|---|
Philadelphia County, PA (countyId=60):
| Book & Writ | OPA # | Address | Plaintiff |
|---|---|---|---|
Notice: Burlington has a Sales Date column and a Defendant column;
Philadelphia has neither. Philadelphia has OPA # (the Office of Property
Assessment number — a Philly-specific tax ID); Burlington has no
equivalent.
The package handles this by never assuming a fixed schema. The Sale
DTO carries an attributes map whose keys come straight from the table's
<th> text. The convenience accessors do case-insensitive lookups and
fall back to null when a column isn't there.
What this means for you:
- If you're storing to a relational DB, a JSON/JSONB column for attributes is the path of least friction. You can also pluck fixed columns out (plaintiff, address, sale_date) into typed columns and leave the rest in a JSON blob.
- If you only care about a few counties, you can hard-code the columns you expect after one quick exploration call.
- If you're aggregating across counties, do not assume any column exists. Use the convenience accessors (which return null gracefully) and design your reports around what's actually populated.
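
As an illustration of the "typed columns plus a JSON blob" approach, here is a minimal standalone sketch that turns one row's attributes into a storable record. The function names and the target column names (county_id, sale_date, and so on) are hypothetical, and the "Sales Date" header is taken from the Burlington table above. Adjust both to your own schema.

```php
<?php

/** Parse an upstream "M/D/YYYY" string into "YYYY-MM-DD", or null if absent/unparseable. */
function parseSaleDate(?string $raw): ?string
{
    if ($raw === null || trim($raw) === '') {
        return null;
    }

    // "!" zeroes the time fields; n and j accept values with or without leading zeros.
    $date = DateTimeImmutable::createFromFormat('!n/j/Y', trim($raw));

    return $date === false ? null : $date->format('Y-m-d');
}

/** Hypothetical mapper: a few typed columns, everything else kept as JSON. */
function toRow(int $countyId, int $propertyId, array $attributes): array
{
    $byLowerKey = array_change_key_case($attributes, CASE_LOWER);

    return [
        'county_id'   => $countyId,
        'property_id' => $propertyId,
        'plaintiff'   => $byLowerKey['plaintiff'] ?? null,
        'address'     => $byLowerKey['address'] ?? null,
        'sale_date'   => parseSaleDate($byLowerKey['sales date'] ?? null),
        'attributes'  => json_encode($attributes, JSON_THROW_ON_ERROR),
    ];
}

$row = toRow(3, 1950595109, [
    'Sheriff #'  => '25001681',
    'Sales Date' => '5/7/2026',
    'Plaintiff'  => 'WELLS FARGO BANK, N.A.',
    'Address'    => '5 PEAR AVENUE BROWNS MILLS NJ 08015',
]);
// $row['sale_date'] is "2026-05-07"; a row without a "Sales Date" header yields null.
```

Keeping the untouched attributes alongside the typed columns means a county-side header change costs you a null in one column rather than lost data.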
The session quirk
This is the one piece of weirdness in the upstream site, and the most likely source of confusion if you go off-script.
The detail endpoint, GET /Sales/SaleDetails?PropertyId=NNNN, looks
self-sufficient — the URL contains a property ID, you'd expect a row
back. But the controller looks up the matching countyId from the
ASP.NET session, not the request. If the session has no countyId (or one
that doesn't match this property), the response is a 302 to the
homepage — sometimes via /Home/Index?aspxerrorpath=/Sales/SaleDetails,
which makes it look like a server-side exception when it's really an
auth-style guard.
Demonstrated:
    # Cold call: no session
    curl -i "https://salesweb.civilview.com/Sales/SaleDetails?PropertyId=1950595109"
    # → HTTP/2 302
    # → location: /

    # With a session that has visited SalesSearch?countyId=3 first
    curl -c jar -o /dev/null "https://salesweb.civilview.com/Sales/SalesSearch?countyId=3"
    curl -b jar "https://salesweb.civilview.com/Sales/SaleDetails?PropertyId=1950595109"
    # → HTTP/2 200, ~10 KB of HTML
The package handles this for you — saleDetails($propertyId, $countyId)
seats the session, and salesWithDetails() reuses one session across the
whole batch (so the seating cost is paid once, not N times). The same
applies to filtered list searches: filtered POSTs to /Sales/SalesSearch
also need the session-stored countyId, and the package issues the seating
GET first.
The only failure mode you can run into: calling saleDetails($propertyId)
without a countyId. The package keeps that signature for backward
compatibility / advanced cases where you've already seated the session
yourself, but in normal use, always pass the countyId you got back
on the corresponding Sale.
Artisan command
The package ships a CLI tool useful for ad-hoc queries, JSON exports, and sanity-checking that the upstream site hasn't changed shape.
    # List every county and its ID
    php artisan civilview:scrape

    # Pull all sales for a county (formatted table)
    php artisan civilview:scrape --county=3

    # JSON output (pipe to jq, redirect to a file)
    php artisan civilview:scrape --county=3 --json > burlington.json

    # Apply filters
    php artisan civilview:scrape --county=3 \
        --filter-status=open \
        --filter-plaintiff="WELLS FARGO" \
        --filter-date="5/7/2026" \
        --json

    # Include each sale's detail page (slow; be patient)
    php artisan civilview:scrape --county=3 --with-details --json
Available flags:
| Flag | Description |
|---|---|
| --county=ID | County ID to scrape. Omit to list available counties. |
| --with-details | Also fetch each sale's detail page. |
| --filter-status=open\|sold | Filter by docket status. |
| --filter-plaintiff=NAME | Plaintiff substring filter. |
| --filter-defendant=NAME | Defendant substring filter. |
| --filter-date=MM/DD/YYYY | Specific sale date. |
| --filter-city=NAME | City filter (only meaningful for counties that expose it). |
| --json | Emit JSON instead of an ASCII table. |
Scheduling, queues, and events
For periodic scraping, dispatch the job from your scheduler:
    use Illuminate\Support\Facades\Schedule;
    use JerseyMike\Civilview\Data\SalesFilter;
    use JerseyMike\Civilview\Jobs\ScrapeCountyJob;

    // routes/console.php (Laravel 12) or app/Console/Kernel.php
    Schedule::call(function () {
        // One queued job per county, so a slow county doesn't block the others.
        foreach ([3, 6, 19, 60] as $countyId) {
            ScrapeCountyJob::dispatch(
                countyId: $countyId,
                filter: new SalesFilter(isOpen: true),
                withDetails: false,
            );
        }
    })->dailyAt('06:00');
The job fires events you can listen for to do the actual persistence:
    use Illuminate\Support\Facades\Event;
    use JerseyMike\Civilview\Events\CountyScraped;
    use JerseyMike\Civilview\Events\SaleDetailScraped;

    // app/Providers/AppServiceProvider.php (boot method)
    Event::listen(function (CountyScraped $event) {
        $event->countyId; // 3
        $event->sales;    // Collection<Sale>

        // ForeclosureSale is your application's own Eloquent model, not part of this package.
        foreach ($event->sales as $sale) {
            ForeclosureSale::updateOrCreate(
                ['property_id' => $sale->propertyId],
                [
                    'county_id'    => $sale->countyId,
                    'plaintiff'    => $sale->plaintiff(),
                    'address'      => $sale->address(),
                    'sale_date'    => $sale->saleDate(),
                    'attributes'   => $sale->attributes, // JSON column
                    'last_seen_at' => now(),
                ],
            );
        }
    });

    Event::listen(function (SaleDetailScraped $event) {
        // Only fires when the job was dispatched with withDetails: true
        $event->detail; // SaleDetail
    });
This event-driven design is intentional. The package never imposes a schema on you, but gives you a single hook where persistence belongs. Your tests can fake the events, your listeners can be queued for further async work, and swapping persistence implementations doesn't touch anything in this package.
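
If your listener needs to know what changed since the previous pull rather than just upserting, one option is to diff two snapshots keyed by property ID. The sketch below is standalone and hypothetical: diffSnapshots is not a package function, and each snapshot is assumed to be a plain array of propertyId => attributes that you have stored yourself.

```php
<?php

/** Sort by key so that column order alone never counts as a change. */
function sortedByKey(array $attributes): array
{
    ksort($attributes);

    return $attributes;
}

/**
 * Hypothetical helper (not part of the package): compare two snapshots,
 * each shaped as [propertyId => attributes].
 */
function diffSnapshots(array $previous, array $current): array
{
    $changed = [];

    // Only IDs present in both snapshots can be "changed".
    foreach (array_intersect_key($current, $previous) as $id => $attributes) {
        if (sortedByKey($attributes) !== sortedByKey($previous[$id])) {
            $changed[] = $id;
        }
    }

    return [
        'added'   => array_keys(array_diff_key($current, $previous)),
        'removed' => array_keys(array_diff_key($previous, $current)),
        'changed' => $changed,
    ];
}

$lastWeek = [
    101 => ['Sales Date' => '4/30/2026', 'Plaintiff' => 'WELLS FARGO BANK, N.A.'],
    102 => ['Sales Date' => '5/7/2026', 'Plaintiff' => 'WELLS FARGO BANK, N.A.'],
];
$today = [
    102 => ['Plaintiff' => 'WELLS FARGO BANK, N.A.', 'Sales Date' => '6/4/2026'],
    103 => ['Sales Date' => '6/4/2026', 'Plaintiff' => 'WELLS FARGO BANK, N.A.'],
];

$diff = diffSnapshots($lastWeek, $today);
// added: [103], removed: [101], changed: [102]
```

A "removed" ID only means the row left the list you scraped; whether it was sold, cancelled, or filtered out depends on the filter you used for both pulls.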
Configuration
config/civilview.php (after php artisan vendor:publish):
| Key | Default | Notes |
|---|---|---|
| base_url | https://salesweb.civilview.com | Override only for testing/proxy. |
| user_agent | Mozilla/5.0 (compatible; civilview-salesweb-scraper/1.0) | Set a contact URL/email here when scraping at volume; it's polite and makes you debuggable to the site operators. |
| timeout | 30 | Per-request seconds. |
| retry_times | 3 | Wraps Http::retry(). The site occasionally returns transient 5xxs. |
| retry_sleep_ms | 200 | Backoff between retries. |
| detail_delay_ms | 250 | Pause between consecutive detail fetches in salesWithDetails(). |
All of these can be backed by .env variables (e.g.
CIVILVIEW_DETAIL_DELAY_MS=500) — see the published config for the env
keys.
Testing
The package's own test suite:
    composer install
    vendor/bin/pest
14 tests, all offline — they parse against captured HTML fixtures in
tests/Fixtures/ so they're deterministic and don't touch the network.
The fixtures cover both schema variants (Burlington's Sheriff#/Defendant
shape and Philadelphia's Book & Writ/OPA shape), the homepage, and a
synthetic detail page.
For your own application's tests, fake the HTTP layer:
    use Illuminate\Support\Facades\Http;
    use JerseyMike\Civilview\Facades\Civilview;

    // Serve a captured fixture instead of hitting the real site.
    Http::fake([
        'salesweb.civilview.com/Sales/SalesSearch*' => Http::response(
            file_get_contents(__DIR__ . '/fixtures/burlington.html')
        ),
    ]);

    $sales = Civilview::sales(3);

    expect($sales)->toHaveCount(89);
Or fake the events:
    Event::fake();

    ScrapeCountyJob::dispatchSync(3);

    Event::assertDispatched(CountyScraped::class);
For a smoke test against the real site, the package's dev dependencies include Orchestra Testbench, so you can run the Artisan command without a host app:
vendor/bin/testbench civilview:scrape --county=3 --json | jq '.[0]'
Politeness and legal
This is a public government portal, but you're still hitting someone else's servers. A few rules of the road:
- Don't hammer. Keep detail_delay_ms reasonable (the default 250 ms is a sensible floor). Don't run twenty-county scrapes in parallel from one IP.
- Identify yourself. Set a descriptive user_agent with a contact URL or email. If you cause a problem, the operators can email you instead of just blocking your IP.
- Cache aggressively. The data updates daily at most. Re-pulling the same county every five minutes is wasteful and rude.
- Respect changes. If the site breaks or starts returning errors, back off; don't retry-loop indefinitely.
- The data is public, but how you use it isn't always. Some jurisdictions have rules about how foreclosure data can be republished or aggregated commercially. If you're building a product on top of this, talk to a lawyer.
This package is provided as-is and doesn't grant you any rights to the underlying data. Tyler Technologies (the platform vendor) and the counties themselves are the source of truth. The author of this package is unaffiliated with both.
Troubleshooting
"Civilview::saleDetails() returns empty attributes."
You called it without a countyId, or with a wrong one. Pass
countyId: $sale->countyId from the corresponding Sale.
"My filter returns the same rows as no filter."
Most likely you used a column the county doesn't expose (e.g.
SalesFilter(defendant: '...') against Philadelphia, which has no
defendant column). The site silently ignores unknown filters. Check the
column list with an unfiltered scrape first.
"I'm getting 5xx errors from the upstream."
The package retries 3 times by default. If errors persist, the site is
down — give it a few minutes. You can bump retry_times and
retry_sleep_ms in config, but more retries won't fix a server that's
genuinely offline.
"My PestPHP tests can't find the package classes."
Run composer dump-autoload after pulling new code. The PSR-4 namespace
is JerseyMike\Civilview\… (StudlyCase, no hyphens).
"The number of rows changed unexpectedly." The dockets are live. Open cases get adjourned, sold, or cancelled constantly. If you need a stable snapshot, capture and timestamp your own copy.
"A column I rely on disappeared."
Counties occasionally rearrange their schemas (it's their portal, not
ours). Use $sale->get('Header Name') for tolerant access, log
unexpected nulls, and don't hard-code column orders.
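
One way to catch this early is to compare the headers you depend on against what a fresh scrape actually returned, and log the difference. The sketch below is standalone; missingHeaders is a hypothetical helper, not part of the package, and it matches headers case-insensitively to mirror the accessors' behaviour.

```php
<?php

/**
 * Hypothetical helper (not part of the package): return the expected
 * headers that are absent from a row's attributes, ignoring case and
 * surrounding whitespace.
 */
function missingHeaders(array $expected, array $attributes): array
{
    $present = array_map(
        static fn ($key): string => strtolower(trim((string) $key)),
        array_keys($attributes),
    );

    return array_values(array_filter(
        $expected,
        static fn (string $header): bool => !in_array(strtolower(trim($header)), $present, true),
    ));
}

// A Philadelphia-shaped row checked against a Burlington-style expectation.
$row = [
    'Book & Writ' => '2308-381',
    'OPA #'       => '1313111130',
    'Address'     => '423 EAST 9TH STREET FLORENCE NJ 08518',
    'Plaintiff'   => 'WELLS FARGO BANK, N.A.',
];

$missing = missingHeaders(['Plaintiff', 'Defendant', 'Address'], $row);
// ['Defendant']
```

Running a check like this on the first row of each scrape gives you a log line the day a county changes its table, instead of a silent column of nulls.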
License
MIT. See LICENSE.md.