How do I find all email addresses on a webpage?

onHover's Email Finder scans visible text (innerText and textContent), mailto links, JSON-LD schema data, other script tags, and data-email attributes. It strips zero-width Unicode characters that sites inject to break email scrapers, and groups results by domain with their source type.

How do I export web data to Google Sheets?

Export as CSV from onHover's Table Capture, then import into Google Sheets via File → Import. For recurring imports from public pages, Google Sheets' IMPORTHTML function can pull tables directly: =IMPORTHTML("https://example.com", "table", 1).

Extract Emails, Tables & Pages in One Click

Q: How do I extract all emails from a webpage?

Right-click, View Source, and Ctrl+F for '@' finds emails in raw HTML but misses JavaScript-rendered content. onHover's Email Finder strips obfuscation characters, scans visible text, script tags, mailto links, JSON-LD schema, and data attributes for comprehensive extraction including dynamically rendered email addresses.

Q: How do I convert a webpage to Markdown?

Chrome has no built-in webpage-to-Markdown converter. onHover's Page Export converts the current page's content to Markdown by parsing the DOM and translating headings, paragraphs, links, images, lists, code blocks, and tables — useful for feeding page content to AI tools and creating documentation.

Q: What is the best Chrome extension for scraping data from websites?

For structured data extraction without coding, onHover extracts emails, tables, and full-page content (Markdown, HTML, PDF, Word). For more complex structured scraping across multiple pages, Instant Data Scraper and Web Scraper offer visual point-and-click selectors. For programmatic scraping, Puppeteer and Playwright provide full automation.

Q: How do I save a webpage as a Word document?

Chrome's Save As saves pages as HTML or web archive, not Word documents. onHover's Page Export converts the current page content to a .docx Word file preserving headings, paragraphs, links, and images — useful for creating editable documentation or sharing content with stakeholders who prefer Word.

You've spent 45 minutes copy-pasting rows from a competitor's pricing table into a spreadsheet. Then you manually transcribed their team page contact info. Then you saved their blog post as a PDF that looks like it was designed by a fax machine. There's a better way. onHover's Extract & Export tool is a Chrome extension feature that extracts data from websites in the format you actually need: tables as CSV, articles as Markdown, pages as clean PDF. One click, right format, done.

What you can extract

Email addresses: Scan the full page DOM for all mailto links and plaintext emails

Tables → CSV: Convert any HTML table to a downloadable spreadsheet

Page → Markdown: Export the main content as clean, editor-ready Markdown

Page → HTML: Save the rendered DOM snapshot as a portable HTML file

Page → PDF: Print-to-PDF the current viewport with proper pagination

Page → Word: Export as a .docx file for sharing with non-technical stakeholders

Finding email addresses on websites

Open the Extract & Export tab and hit "Find Emails." The email finder scans the full rendered document — including dynamically loaded content — and lists every email address it finds, deduplicated and sortable.

Three situations where this saves real time:

Building a contact list from a conference speaker directory or event site
Auditing your own pages for exposed email addresses that attract spam bots
Gathering contributor emails from an open source project's contributors page
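
As a rough illustration of what an email scan involves, here is a minimal Python sketch that strips zero-width characters, matches addresses with a deliberately simple pattern, deduplicates them, and groups them by domain. It is not onHover's implementation; the function names `find_emails` and `group_by_domain`, the character list, and the regular expression are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Zero-width characters that are sometimes injected to break pattern matching
# (an assumed, non-exhaustive list).
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

# Deliberately simple pattern; valid address syntax is broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text: str) -> list[str]:
    """Return unique, lowercased addresses in order of first appearance."""
    cleaned = text.translate(ZERO_WIDTH)  # None values delete the characters
    seen = {}
    for match in EMAIL_RE.findall(cleaned):
        seen.setdefault(match.lower(), None)
    return list(seen)


def group_by_domain(emails: list[str]) -> dict[str, list[str]]:
    """Group addresses under the part after the last '@'."""
    grouped = defaultdict(list)
    for email in emails:
        grouped[email.rsplit("@", 1)[1]].append(email)
    return dict(grouped)


sample = "Contact: ali\u200bce@example.com, bob@example.org, Alice@example.com"
emails = find_emails(sample)
print(emails)
print(group_by_domain(emails))
```

Lowercasing before deduplication treats `Alice@example.com` and `alice@example.com` as the same address, which is usually what a contact list wants.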

Convert webpage tables to CSV

Any <table> on the page gets a download button in the viewer. The HTML-table-to-CSV export works on pricing comparison tables, leaderboards, analytics dashboards that render HTML tables, and government data portals that provide no official export option. Column headers are preserved, and merged cells are handled correctly.
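
To make the table-to-CSV idea concrete, here is a minimal Python sketch using only the standard library. It reads the first table in an HTML string and writes its rows as CSV. The class and function names are hypothetical, and the treatment of merged cells (text followed by blank cells for a colspan) is an assumption for the example, not a description of how onHover handles them.

```python
import csv
import io
from html.parser import HTMLParser


class TableToRows(HTMLParser):
    """Collect the cells of the first <table> in an HTML string as rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text chunks of the cell being read, or None
        self._colspan = 1
        self._done = False  # True once the first table has closed

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
            span = dict(attrs).get("colspan") or "1"
            self._colspan = int(span) if span.isdigit() else 1

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if self._done:
            return
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            text = " ".join("".join(self._cell).split())
            # Assumption: a cell spanning N columns becomes its text plus N-1 blanks.
            self._row.extend([text] + [""] * (self._colspan - 1))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "table":
            self._done = True


def table_to_csv(html: str) -> str:
    parser = TableToRows()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


html = """
<table>
  <tr><th>Plan</th><th>Price</th></tr>
  <tr><td>Team, annual</td><td>$10</td></tr>
  <tr><td colspan="2">Contact sales</td></tr>
</table>
"""
print(table_to_csv(html))
```

The `csv` module quotes the cell containing a comma, so the column boundaries survive the round trip into a spreadsheet.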

Works on paginated tables

The CSV export captures whatever is currently rendered in the DOM. For paginated tables, use the page's "show all" option first if it has one, then export — no manual copy-paste for each page of results.

Page to Markdown conversion

The Markdown export runs a semantic pass on the page — extracts the primary content block, converts headings, lists, links, and code blocks to proper Markdown, and strips navigation, ads, and sidebar boilerplate. What you get is a clean, portable document.

Particularly useful for:

Saving documentation pages to Obsidian, Notion, or Bear without the junk
Turning a blog post into an editable draft for repurposing
Creating a clean text version of an article to feed into an LLM context window

Page to PDF and Word export

The PDF export applies print-specific CSS that hides headers, footers, navbars, and sidebar clutter before rendering. The result actually looks like a document — not a browser screenshot. The Word export (.docx) is the right choice when you need to hand off web content to someone who'll edit it further — client reports, stakeholder summaries, content workflows where Markdown isn't an option.

A real workflow example

You're doing a competitor content audit. For each blog post you find interesting: extract to Markdown, paste into your notes. For their pricing page: export the comparison table as CSV. For their team page: pull the email addresses. An afternoon of competitive research that used to mean hours of copy-paste now takes 20 minutes. The data was already structured on the page — you just needed a tool that respects that.

Why copy-paste fails at scale

Copy-paste works for one thing. You see a piece of data, you select it, you paste it somewhere. That's fine for a single value. It completely falls apart the moment you need more than a handful of items from a structured source.

Try copy-pasting a 60-row pricing table. You paste it and get a wall of text with no column separation, because the clipboard doesn't carry table structure. Now you spend time manually adding tabs or commas, reconstructing column relationships from memory. Or try copy-pasting a list of 40 speaker names and emails from a conference page — you're selecting each one individually, tabbing to your spreadsheet, pasting, tabbing back, repeat. After the first ten you're making mistakes.

The structural data is already there in the HTML. The browser has already parsed it into a proper DOM tree with <tr>, <td>, href attributes, heading hierarchy — all of it. Copy-paste throws that structure away and gives you plain text. A browser extension reads from the rendered DOM directly, which means it gets the structure for free. That's the fundamental advantage: you're not fighting the clipboard, you're working with what the browser already knows.

External scraping tools have the opposite problem. They re-fetch the page from scratch, which means they have to handle authentication again, they miss dynamically-rendered content (anything loaded by JavaScript after the initial HTML), and they can't see what you're actually looking at — they see what the server sends, which is often different from what your authenticated, logged-in session renders. The extension is already inside the rendering context. It sees exactly what you see.

JSON export for developers

CSV is the right format when the destination is a spreadsheet or a non-technical audience. For developers, JSON is usually better — it preserves hierarchy, it's directly consumable by code, and it doesn't flatten nested structure into something you then have to reconstruct.

When you export a page as JSON from the onHover developer toolkit, here's what you get: the heading tree as a nested structure (h1 containing its child h2s, each h2 containing its child h3s), all links with their anchor text and href, all images with their src and alt text, and any tables as nested arrays of row arrays. The email extraction output is a flat array of strings — one address per entry — which pastes directly into a tool that expects one-per-line input.

JSON export structure

{
  "headings": [{ "level": 1, "text": "Page Title", "children": [...] }],
  "links": [{ "text": "anchor text", "href": "https://..." }],
  "images": [{ "src": "...", "alt": "..." }],
  "tables": [["Col A", "Col B"], ["row1a", "row1b"]],
  "emails": ["name@example.com", ...]
}

This format drops straight into a script. If you're building a competitive analysis tool, an internal knowledge base, or a monitoring script that tracks page structure changes — you don't want to write a parser for CSV. JSON is already the shape your code expects.
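
As a small sketch of what "drops straight into a script" can look like, the Python below loads an export shaped like the structure above and prints the heading tree as an indented outline. The sample values and the `outline` helper are made up for illustration; only the key names (`headings`, `children`, `text`, `links`, `href`) come from the structure shown.

```python
import json

# Hypothetical export content, following the key names in the structure above.
export_json = """
{
  "headings": [
    {"level": 1, "text": "Pricing", "children": [
      {"level": 2, "text": "Plans", "children": []},
      {"level": 2, "text": "FAQ", "children": []}
    ]}
  ],
  "links": [{"text": "Docs", "href": "https://example.com/docs"}],
  "emails": ["team@example.com"]
}
"""


def outline(headings, depth=0):
    """Flatten the nested heading tree into indented lines."""
    lines = []
    for node in headings:
        lines.append("  " * depth + node["text"])
        lines.extend(outline(node.get("children", []), depth + 1))
    return lines


data = json.loads(export_json)
print("\n".join(outline(data["headings"])))
print([link["href"] for link in data["links"]])
```

Because the hierarchy is already nested, the traversal is a few lines of recursion with no parsing step in front of it.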

Using Markdown export for research notes

The Markdown export does something most page-save tools don't: it reads the page semantically instead of just dumping HTML. It identifies the main content block — the article body, the documentation section, the primary text — and ignores everything that isn't content. No nav links. No footer boilerplate. No cookie banner text that got included because it was technically in the DOM.

Headings become #, ##, ###. Lists become proper Markdown bullet or numbered lists. Links stay intact with their anchor text. Code blocks get fenced with triple backticks. The output pastes cleanly into Obsidian, Notion, Bear, Logseq — any note-taking tool that renders Markdown — and looks like a real document.
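
The mapping described above can be sketched in a few dozen lines of Python with the standard-library HTML parser. This is a simplified illustration, not onHover's converter: the `MarkdownSketch` class, the `to_markdown` helper, and the set of skipped tags are assumptions, and images, tables, and text outside block elements are left out to keep it short.

```python
from html.parser import HTMLParser

SKIP = {"nav", "footer", "aside", "script", "style"}  # assumed boilerplate tags
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
FENCE = "`" * 3


class MarkdownSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.out = []        # finished Markdown blocks
        self.buf = []        # text of the block currently being built
        self.skip_depth = 0  # greater than 0 while inside a skipped container
        self.lists = []      # stack of [kind, item_count] for open lists
        self.href = None

    def flush(self, prefix=""):
        text = " ".join("".join(self.buf).split())  # collapse whitespace
        if text:
            self.out.append(prefix + text)
        self.buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
        if self.skip_depth:
            return
        if tag in ("ul", "ol"):
            self.lists.append([tag, 0])
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.buf.append("[")
        elif tag in HEADINGS or tag in ("p", "li", "pre"):
            self.buf = []  # start a fresh block

    def handle_endtag(self, tag):
        if tag in SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth:
            return
        if tag in HEADINGS:
            self.flush("#" * int(tag[1]) + " ")
        elif tag == "p":
            self.flush()
        elif tag == "li" and self.lists:
            current = self.lists[-1]
            current[1] += 1
            self.flush(f"{current[1]}. " if current[0] == "ol" else "- ")
        elif tag in ("ul", "ol") and self.lists:
            self.lists.pop()
        elif tag == "a":
            self.buf.append(f"]({self.href or ''})")
        elif tag == "pre":
            code = "".join(self.buf).strip("\n")  # keep inner whitespace
            self.out.append(f"{FENCE}\n{code}\n{FENCE}")
            self.buf = []

    def handle_data(self, data):
        if not self.skip_depth:
            self.buf.append(data)


def to_markdown(html):
    parser = MarkdownSketch()
    parser.feed(html)
    return "\n\n".join(parser.out)


page = """
<nav><a href="/">Home</a></nav>
<h1>Install</h1>
<p>See the <a href="https://example.com/docs">docs</a> first.</p>
<ul><li>Download</li><li>Run</li></ul>
<pre><code>pip install demo</code></pre>
<footer>Copyright</footer>
"""
print(to_markdown(page))
```

In this sketch the navigation link and the footer text never reach the output because everything inside a skipped container is ignored.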

One use case we didn't anticipate when building this: Markdown as LLM context input. Raw HTML is a terrible context window filler because it's mostly tags and attributes that the model has to parse before getting to actual content. Clean Markdown is much denser with actual information per token. If you're doing research and want to feed a page into a Claude or GPT conversation, exporting to Markdown first gives you significantly better results than pasting raw HTML or trying to summarize manually.

Markdown length vs HTML length

A typical documentation page exported as raw HTML runs 40,000–80,000 characters. The same page as clean Markdown is usually 5,000–15,000 characters. That's 5–8x more content you can fit into a fixed context window — and the model spends its attention on actual words, not tag attributes.
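
The 5–8x figure follows from the character ranges quoted above. A quick check in Python, assuming the low ends of the two ranges are compared with each other and likewise the high ends:

```python
html_chars = (40_000, 80_000)  # raw HTML range quoted above
md_chars = (5_000, 15_000)     # clean Markdown range quoted above

low_ratio = html_chars[1] / md_chars[1]   # largest HTML vs largest Markdown
high_ratio = html_chars[0] / md_chars[0]  # smallest HTML vs smallest Markdown

print(f"{low_ratio:.1f}x to {high_ratio:.1f}x")  # prints "5.3x to 8.0x"
```

Character counts are only a proxy for tokens, so the real gain in a given model's context window will vary with its tokenizer.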

A note on responsible use of email extraction

Email extraction is one of those features that has genuinely useful applications and also obvious misuse potential. We want to be straightforward about where we stand on this.

The legitimate cases are real. Auditing your own site for exposed addresses that bots can harvest. Compiling a contact list from a conference speaker directory where all the information is publicly posted by the organizers. Collecting contributor emails from an open source project's GitHub contributors page when you're reaching out about a security disclosure. These are things developers actually need to do.

The tool works on a per-page, per-use basis. You visit a page, you run the extraction, you get the emails on that page. There's no crawling, no bulk scraping across thousands of pages, no data stored outside your current browser session. When you close the panel, the results are gone. This is an intentional design constraint, not a limitation we're planning to remove.

Using extracted emails for unsolicited mass outreach isn't a use case we support. Beyond being generally unwelcome, it is restricted or outright illegal in many jurisdictions under laws such as CAN-SPAM, GDPR, and their equivalents. The tool works per-page because that maps to the legitimate use cases. If your use case requires bulk collection at scale, this isn't the right tool, and honestly, you should think carefully about whether the use case is one you want to pursue.

Why browser extensions win for data extraction

Desktop scraping apps and server-side crawlers have a structural problem: they re-fetch the page. They send an HTTP request, get back HTML, and parse that. For static pages with no authentication, this is fine. For the modern web, it misses most of what matters.

A significant portion of pages that contain interesting data are JavaScript-rendered. The initial HTML response is essentially an empty shell — a script tag and a root div. The actual content populates after the JS runs. A server-side scraper sees the shell. A browser extension sees the finished page, because it runs after the browser has already executed all the JavaScript and rendered the full DOM.

Authentication is the other issue. You're logged into a SaaS dashboard. The data you want is behind that login. A desktop app has to handle your session credentials, manage cookies, potentially solve CAPTCHA challenges — it's a whole separate authentication layer to build and maintain. The browser extension runs in your existing authenticated session. You're already logged in. The extension reads the DOM you're already looking at. No credential management, no session handling, no re-authentication.

This is why we built the extraction features in onHover as a Chrome extension rather than as a standalone tool. The browser is doing the hard work — rendering, JavaScript execution, authentication — and we're just reading the result. For structured data extraction from the modern web, that architecture is genuinely better than alternatives that try to replicate the browser's work from the outside.

Frequently asked questions

How do I extract all emails from a webpage?

Right-click, View Source, then Ctrl+F and search for '@' — this finds email addresses in the raw HTML but misses JavaScript-rendered content. For comprehensive extraction including dynamically rendered text, script tags, mailto links, JSON-LD schema data, and data attributes, onHover's Email Finder scans all visible content sources and groups results by domain with their source type.
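
For readers curious how one of those sources can be read, here is a minimal Python sketch that pulls addresses out of JSON-LD script blocks. It is an illustration under assumptions, not onHover's code: the names `JsonLdCollector`, `emails_in`, and `jsonld_emails` are hypothetical, and it only looks at string values stored under an `email` key, which is the property name schema.org uses.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Gather the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._inside:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False


def emails_in(value):
    """Walk parsed JSON and yield string values stored under an "email" key."""
    if isinstance(value, dict):
        for key, item in value.items():
            if key == "email" and isinstance(item, str):
                yield item.removeprefix("mailto:")
            else:
                yield from emails_in(item)
    elif isinstance(value, list):
        for item in value:
            yield from emails_in(item)


def jsonld_emails(html):
    collector = JsonLdCollector()
    collector.feed(html)
    found = []
    for block in collector.blocks:
        try:
            found.extend(emails_in(json.loads(block)))
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than fail the whole scan
    return found


page = (
    '<script type="application/ld+json">'
    '{"@type": "Organization",'
    ' "contactPoint": [{"email": "mailto:support@example.com"}],'
    ' "email": "info@example.com"}'
    "</script>"
)
print(jsonld_emails(page))
```

Addresses found this way never appear in the visible text, which is why a Ctrl+F over what you can see on the page would miss them.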

How do I export an HTML table to CSV or Excel?

Most HTML tables can be selected, copied, and pasted into Excel — Excel recognizes table structure when pasting. For tables that don't copy cleanly (merged cells, complex headers), or for tables where you want a clean CSV file, onHover's Table Capture converts any HTML table on the page to a downloadable CSV with one click. The CSV imports correctly into Excel, Google Sheets, and any CSV-compatible tool.

How do I save a webpage as a PDF in Chrome?

Chrome's built-in print dialog (Ctrl+P) can save to PDF via the 'Save as PDF' destination. For better formatting control, onHover's Page Export saves the page as PDF via the browser's print mechanism with options to include or exclude backgrounds and headers. For text-heavy pages where you only need the content itself, the Markdown or HTML export options usually produce cleaner documents than PDF.

How do I convert a webpage to Markdown?

Chrome doesn't have a built-in webpage-to-Markdown converter. onHover's Page Export converts the current page's main content to Markdown by parsing the DOM and translating headings, paragraphs, links, images, lists, code blocks, and tables into their Markdown equivalents. The output is useful for feeding page content to AI tools as context, creating documentation from existing web pages, and archiving articles.

What is the best Chrome extension for scraping data from websites?

For structured data extraction without coding, browser extensions are the most accessible approach. onHover extracts emails, tables, and full-page content (Markdown, HTML, PDF, Word). For more complex structured scraping (extracting specific data from multiple pages), tools like Instant Data Scraper and Web Scraper offer visual point-and-click selectors. For programmatic scraping, Puppeteer and Playwright provide full browser automation with JavaScript.

How do I save a webpage as a Word document?

Chrome's Save As dialog saves pages as HTML or as a complete web archive, not as Word documents. onHover's Page Export converts the current page content to a .docx Word file that preserves headings, paragraphs, links, and images. This is useful for creating editable documentation from web content, exporting blog posts for offline editing, or sharing page content with stakeholders who prefer Word over web formats.

How do I find emails on a LinkedIn profile or company page?

Email addresses on LinkedIn are often obfuscated through zero-width characters, rendering tricks, or JavaScript manipulation to prevent simple scraping. onHover's Email Finder strips zero-width Unicode characters that sites inject to break email pattern matching, scans both innerText and textContent (which catches elements hidden from innerText), and extracts from script tags where some sites embed profile data as JSON. Note that extracting personal emails for unsolicited marketing may violate platform terms of service and applicable law.

Can I export web data to Google Sheets directly?

There's no direct export to Google Sheets from a browser extension without an API integration. The standard workflow is: export the data as CSV from onHover's Table Capture, then import the CSV into Google Sheets via File → Import. For recurring imports, Google Sheets has an IMPORTHTML function that can pull tables from public web pages directly: =IMPORTHTML("https://example.com", "table", 1).
