AI doesn't read your careers site the way a human does. Five things you'd want to know.

Ask your LLM to summarise your careers page today. Ask it again on Wednesday. Then again next Friday.
If the three summaries diverge significantly, the bottleneck isn't the AI. It's your content - and you have a problem that's going to compound across 2026 and 2027 as candidates increasingly start their research with an AI agent rather than a Google search.
This isn't theoretical. The five lessons below (with a load more coming in a future post) come out of several weeks trying to build something that reads employer brand content the way a thoughtful human would. To make them land properly, I need to explain a bit of context first - including what an "AI agent reading your site" actually means in practice, and why I'm a little late shipping the tool that produced these findings.
The context (and why the product is a little late)
Some of you will have seen me share on LinkedIn about a product I've been building. It shows you what AI is currently learning about your employer brand - how a candidate-research AI agent reads your careers site, your job ads, and your culture pages, and what story it's telling on your behalf.
It was meant to launch by now. It hasn't, because I went back to the drawing board on how the engine actually reads and scores the companies we were seeding the database with.
The original approach was straightforward - scrape a careers site, hand the text to a scoring model, get a number back. The problem was that the numbers weren't reliable. The same content fed into the same engine on different days produced different results. Confident, but different.
That's the kind of thing you ship if you don't care about quality. If you don't care about adding real value. If you don't care about having real impact for the people and clients on the other end of it. I care deeply about all three.
I also lie awake at night worrying about a disgruntled senior leader tapping me on the shoulder and questioning how their company's rating was arrived at - whether the scrutiny behind it stands up under serious examination. It has to. The defensibility behind anything I put in front of clients and community has to be watertight, the due diligence deep, the methodology transparent. The approach has to reveal the reality of how the score landed, not hide behind "the AI did it". π€·πΌ
That's the core of my moral compass and how I think about this work. The moment there's a perception that something's being hidden, trust erodes. And once trust erodes, the damage is reputational and commercial - in that order, and both severe for me and my business. So the engine went back to the drawing board.
What's there now is a four-category approach. The engine looks at your talent communication and employer brand across four kinds of content:
- Employer narrative - the careers landing page, the "why work here" pitch, the headline story you tell about who you are as an employer.
- Identity and principles - your values, your mission, your culture statements, the bits that articulate what the organisation stands for.
- People and proof - employee stories, testimonials, named programmes, evidence that the principles play out in real life.
- Job adverts - the actual roles you're hiring for, where the boilerplate around the role often says more about the company than the role description does.
Before any scoring happens, the engine has to find the content. It scrapes your site (back-end and front-end), classifies pages into the four categories, then breaks the text on each page into "chunks" - chunks of employer narrative, chunks of identity content, chunks of people-and-proof material, chunks of job-ad text. A "narrative chunk" is a paragraph or section that's been classified as employer narrative content. The engine then scores each chunk in context and aggregates back up.
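For the technically curious, here's roughly the shape of that flow. It's a simplified sketch, not the real engine - every name in it (classify_page, score_chunk, the category labels) is an illustrative stand-in:

```python
from statistics import mean

CATEGORIES = ["employer_narrative", "identity_principles",
              "people_proof", "job_adverts"]

def chunk(text: str) -> list[str]:
    # Naive chunking: split on blank lines. The real engine is cleverer.
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def classify_page(url: str, text: str) -> str:
    # Stand-in: in practice a model call assigns one of the four categories.
    return "employer_narrative"

def score_chunk(piece: str, category: str) -> float:
    # Stand-in: each chunk is scored in the context of its category.
    return 0.5

def score_site(pages: dict[str, str]) -> dict[str, float]:
    by_category: dict[str, list[float]] = {c: [] for c in CATEGORIES}
    for url, text in pages.items():
        category = classify_page(url, text)
        for piece in chunk(text):
            by_category[category].append(score_chunk(piece, category))
    # Aggregate chunk scores back up to one number per category.
    return {cat: mean(scores) for cat, scores in by_category.items() if scores}
```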
If the scrape can't find content across all four categories, the engine retries with three. If it can't find three categories' worth of content, or the volume in any category falls below a quality threshold, the company gets rejected from the dataset and we move on to the next one. The rejection logic exists for a reason. A meaningful score needs meaningful content to score. If your careers site is 200 words of cookie consent and an "apply now" button, no AI assessment of it is honest, and no comparison against other companies built on it is either.
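The coverage check itself is simple enough to sketch too. The thresholds below are invented for illustration - they're not the engine's real numbers:

```python
MIN_CATEGORIES = 3            # four is the target; three is the fallback floor
MIN_WORDS_PER_CATEGORY = 150  # hypothetical quality threshold

def accept_company(chunks_by_category: dict[str, list[str]]) -> bool:
    populated = [
        cat for cat, chunks in chunks_by_category.items()
        if sum(len(c.split()) for c in chunks) >= MIN_WORDS_PER_CATEGORY
    ]
    # Below the floor, reject rather than score noise.
    return len(populated) >= MIN_CATEGORIES
```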
That's the engine in plain language. The five lessons below are a small selection (with a shit-ton more coming soon) of what trying to make it work taught me about what AI is reading on your site - and about the content underneath it.
Enjoying this? I can pop the next one in your inbox if you'd like.
1. AI gives different answers to the same question on different days. The variance is itself a diagnostic.
We fed Monday.com's careers page to the same scoring engine three times in three days. No input changes. First run: one narrative chunk. Second run: zero narrative chunks (the run actually crashed because of it). Third run: three narrative chunks. Same prompt, same content, three different reads.
The variance isn't evenly distributed. It's concentrated on companies whose content is generic. Wise reads consistently across runs because Wise's content names specific programmes ("Mobile Wiser scheme: 90 days remote anywhere"), specific numbers ("125+ nationalities"), specific testimonials. There's nowhere for the AI to drift. Generic aspirational content - "supportive environment", "we believe in growth", "world-class team" - gives the AI maximum room to land somewhere different each visit.
So if you ever hear "AI is unreliable, so we don't need to worry about how our content reads to it", that's exactly backwards. The unreliability is the signal. AI is, indirectly, telling candidates which companies have invested in distinctive employer brand and which haven't. Specificity buys consistency. Generic content guarantees inconsistent perception.
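You can run this diagnostic yourself in a few lines. A minimal sketch, assuming you wrap whatever AI you use in a read_site(url) function that returns its answer as a string:

```python
from collections import Counter

def drift(read_site, url: str, runs: int = 3) -> Counter:
    # Ask the identical question several times; count the distinct answers.
    return Counter(read_site(url) for _ in range(runs))
```

One answer appearing three times means your content pins the model down. Three different answers from identical input is the drift - and the diagnostic.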
2. AI scrapes everything on your page, including the chrome.
When a human visits a careers page, their eye filters out the cookie banner, the navigation menu, the footer, the "subscribe to our newsletter" pop-up, the privacy notice. None of it even registers.
AI doesn't filter.
Half of one fintech's careers landing page text - 32 chunks out of 64 - was chrome. Cookie consent. Footer links. Form labels. The AI has to actively distinguish substance from noise, and most candidate-research tools don't bother.
The downstream effect is that every cookie banner on your careers page (which says exactly the same thing as everyone else's cookie banner) becomes part of what the AI is reading about your company. It dilutes your distinctiveness score because it IS undistinctive. It's the same boilerplate every site uses. Half of your careers page is the same page as your competitor's careers page.
The fix is unglamorous. Semantic HTML, modest cookie banners, footers that don't repeat the entire sitemap, pages where the actual content sits in the first viewport rather than buried under 200 words of consent management. None of this is your dev team's problem first. It's yours.
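If you want a first number for your own page, here's a rough chrome-ratio check using Python and BeautifulSoup. The tag list is a blunt heuristic of mine - plenty of chrome also lives in divs with cookie-consent class names - but it's a start:

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

CHROME_TAGS = ["nav", "footer", "header", "form", "aside"]

def chrome_ratio(html: str) -> float:
    soup = BeautifulSoup(html, "html.parser")
    total_words = len(soup.get_text(" ", strip=True).split())
    for tag in soup.find_all(CHROME_TAGS):
        tag.decompose()  # drop navigation, footers, forms and the like
    substance_words = len(soup.get_text(" ", strip=True).split())
    return 1 - substance_words / total_words if total_words else 0.0
```

If that comes back anywhere near 0.5, half of what an AI reads on your page is chrome.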
3. Many enterprise careers sites are 169-word error messages.
I scraped two enterprise careers sites - one running on Workday, the other behind Akamai bot protection. We got 169 and 178 words back respectively. All of it variations of "please log in" and "we couldn't verify you're a human".
The AI scored those error messages as employer brand content. The score was meaningless. To a candidate using AI to research where to apply, both companies effectively don't exist.
Workday alone hosts an enormous chunk of enterprise careers. If you're a CHRO at a Fortune 500 and your job ads sit behind an authentication shell that returns thin generic text to anyone not already logged in, you're invisible to a candidate research process that increasingly happens in ChatGPT and Claude before it ever touches your site directly.
This isn't AI's failure. It's a known cost of how enterprise ATSes work, made newly visible because AI is now the candidate's first reader. Surface enough of each role publicly that a thin scrape returns real content. Title, location, first 300 words of the actual job. Anything is more than nothing.
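The self-test is trivial and worth running today: fetch your own careers page the way a thin scraper would - logged out, no JavaScript - and look at what comes back. A sketch; the wall-phrase list is my own guess, so extend it for your ATS:

```python
import requests                # pip install requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

WALL_PHRASES = ["please log in", "sign in", "verify you are human",
                "enable javascript"]

def thin_scrape_report(url: str) -> dict:
    html = requests.get(url, timeout=15).text
    text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True)
    return {
        "word_count": len(text.split()),
        "looks_walled": any(p in text.lower() for p in WALL_PHRASES),
    }
```

A 169-word result with looks_walled set to True means an AI reader sees an error message where your employer brand should be.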
4. The hard problems aren't AI problems. They're content problems.
Wise scored consistently well across our test runs. Why? Because the content is rich. 90+ pages of named employee stories. Specific programmes with specific names. Concrete numbers. A distinct voice across role descriptions. The AI doesn't have to guess what Wise is about. The content tells it.
A comparable-size global tech consultancy scored consistently low. Why? "World-class work environment". "Our people are the backbone". "We invest in our people". The AI scored it low because the content IS low. No amount of scraping cleverness will manufacture distinctiveness from boilerplate.
This is the lesson that would have been the post if I'd written it six months ago without building the tool. AI didn't create the problem of generic employer branding. It just made the cost of it visible at scale, in real time, every time a candidate runs a query.
If your employer brand reads identically to twenty other companies', AI will confidently report it as such. The fix isn't better SEO. It isn't better metadata. It isn't a CMS upgrade. It's specific, named, distinctive content. Stories with names. Programmes with details. Numbers that aren't round.
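You can approximate that last point with an almost embarrassingly crude heuristic: count concrete signals against stock phrases. Both lists below are illustrative starting points, not a validated model:

```python
import re

STOCK_PHRASES = ["world-class", "fast-paced", "supportive environment",
                 "we believe in growth", "our people are the backbone"]

def specificity_signals(text: str) -> dict:
    lower = text.lower()
    numbers = re.findall(r"\d[\d,.]*\+?%?", text)
    return {
        "numbers": len(numbers),
        # Non-round numbers ("125+", "169") are stronger evidence than "100".
        "non_round": sum(1 for n in numbers if not n.rstrip("+%.,").endswith("0")),
        "stock_phrases": sum(lower.count(p) for p in STOCK_PHRASES),
    }
```

Run it over your careers page, then over Wise's, and compare.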
5. Your job-ad boilerplate compounds whatever signal your wording carries.
ATSes often rotate which roles are visible at any given moment. I watched this happen on two of my test companies, where 4 of 5 visible roles changed between two visits. To a candidate's AI, that means the only constant across visits is your job-ad template. The intro paragraph. The benefits boilerplate. The "About Us" outro.
Structural templating is fine. It's expected. It's good practice. What matters is what's inside the template.
If your boilerplate says "We're a fast-paced, dynamic environment where every voice matters" across all 50 job ads, the AI sees "this company is generic" reinforced fifty times. If your boilerplate names specific programmes, real benefits and a distinctive tone, the AI sees "this company is specific" reinforced fifty times. The rotation doesn't change your brand. It amplifies whatever signal your wording already carries.
Audit your JD template tomorrow. Pull any current job ad. Read the bits that aren't role-specific. Could they appear, unchanged, on three of your competitors' sites? If yes, you're spending fifty role rotations a month reinforcing something generic. The fix could take a day or two. (With a tool like The Talent Palette [*cough*... nudge, nudge, wink, wink πͺπΌ π].)
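And if you'd rather not eyeball fifty ads by hand, the audit automates in a few lines. Assuming ads is a list of your live job-ad texts, the verbatim paragraphs they all share are your template:

```python
def shared_boilerplate(ads: list[str]) -> set[str]:
    # One set of paragraphs per ad; the intersection is the template.
    paragraph_sets = [
        {p.strip() for p in ad.split("\n\n") if p.strip()}
        for ad in ads
    ]
    return set.intersection(*paragraph_sets) if paragraph_sets else set()
```

Whatever comes back is the constant your candidates' AI sees on every visit. Apply the question above to it.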
The broader point
AI is the spotlight, not the problem. The probabilistic, inconsistent, junk-confident, cookie-banner-eating, auth-walled, boilerplate-amplifying way it reads your site doesn't change the underlying truth - candidates will increasingly research employers through AI agents in 2026 and 2027, and the cost of generic content compounds with every query.
This is exactly the gap I've been arguing about for years. Specific beats aspirational. Honest beats polished. Distinctive beats blandly appealing. AI just made the cost of getting it wrong measurable. π
My new tool is coming soon - with a robust report you can use to plan your strategy for the rest of 2026 and into 2027. A little late, because I'd rather ship something that survives serious scrutiny than something that doesn't.
More on that coming soon. Stay tuned. βΊοΈ
Like what you're reading?
If my content resonates with you, I can deliver it to your inbox whenever I publish something new. No fluff and definitely no spam.


