---
title: Architecture Recommendations
excerpt: ''
deprecated: false
hidden: false
metadata:
  title: ''
  description: ''
  robots: index
next:
  description: ''
---
There are two ways to build Content Gateway integrations:

* Using system APIs
* Using web scrapers

# Strategy 1: Use System APIs

System APIs offer a structured, programmatic way to retrieve data directly from the source system. This is the most robust approach for building Content Gateway integrations.

For this approach, you need to follow 3 steps:

1. Conduct source system API discovery (including API endpoints and authentication).
2. Create a server that can host [Content Gateway APIs](https://help.moveworks.com/reference/get_files). You can use middleware tools or host your own server.
3. Return content using source system APIs every time your Gateway APIs are invoked.

# Strategy 2: Use Web Scrapers

Web scraping can be used when source system APIs are unavailable, though it comes with significant challenges.

For this approach, you need to follow 3 steps:

1. Build a web scraper to crawl and retrieve content from source systems. You may need to use external libraries such as [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) or [Selenium](https://www.selenium.dev/) depending on your purpose.
2. Create a server that can host [Content Gateway APIs](https://help.moveworks.com/reference/get_files). You can use middleware tools or host your own server.
3. Return content by scraping source systems every time your Gateway APIs are invoked.

# Comparison of approaches

You can build gateway integrations using either source system APIs or web scrapers, but we **highly recommend using source system APIs**, because scrapers break easily and are unreliable.
Here is a detailed comparison of the 2 approaches:

| **Aspect**             | **Web Scraping (Cons)**                                                                                           | **Why system APIs are better**                                                                  |
| ---------------------- | ----------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| **Reliability**        | Scrapers depend on the structure of the site, which can change without notice and easily break your integration.  | APIs are designed for structured data access with stable endpoints.                             |
| **Data Precision**     | Scraped data often includes unnecessary parsed data such as HTML headers, footers, images or JavaScript snippets. | APIs provide precise, well-defined data that ensures higher Copilot accuracy.                   |
| **Performance**        | Scraping is much slower because it involves rendering and parsing HTML, JavaScript, and CSS.                      | APIs are optimized for performance, allowing efficient data retrieval.                          |
| **Scalability**        | Requires additional effort to handle rate limits, paginated data, and large datasets efficiently.                 | APIs are designed with scalability in mind, including features like rate limits and pagination. |
| **Access Permissions** | Scraping may not support authenticated access, or may require complex workarounds to handle web login.            | APIs have secure authentication methods like OAuth and offer robust permission management.      |
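
The three steps of Strategy 1 can be sketched as a small self-hosted server. This is a minimal illustration using only the Python standard library, not the actual Content Gateway contract: the `/files` route, the response fields (`id`, `name`, `content`), and the `fetch_source_documents` helper are all hypothetical placeholders. Check the Content Gateway API reference for the real request and response schema, and replace the stub with authenticated calls to your source system's API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, List, Type

SourceDoc = Dict[str, str]


def fetch_source_documents() -> List[SourceDoc]:
    """Hypothetical stand-in for a call to the source system's API.

    A real integration would make an authenticated HTTP request here.
    """
    return [
        {"doc_id": "42", "title": "VPN setup", "body": "Install the client."},
    ]


def to_gateway_files(source_docs: List[SourceDoc]) -> Dict[str, List[Dict[str, str]]]:
    """Map source records to an assumed (not official) gateway response shape."""
    return {
        "files": [
            {"id": doc["doc_id"], "name": doc["title"], "content": doc["body"]}
            for doc in source_docs
        ]
    }


def make_handler(fetch: Callable[[], List[SourceDoc]]) -> Type[BaseHTTPRequestHandler]:
    """Build a request handler that calls `fetch` on every gateway request."""

    class GatewayHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            # "/files" is a placeholder route, not the documented endpoint.
            if self.path != "/files":
                self.send_error(404)
                return
            payload = json.dumps(to_gateway_files(fetch())).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return GatewayHandler


if __name__ == "__main__":
    server = HTTPServer(("127.0.0.1", 8080), make_handler(fetch_source_documents))
    server.serve_forever()
```

Passing the fetch function into `make_handler` keeps the source system call separate from the HTTP layer, so the same handler could be backed by a scraper if no API exists, and the mapping logic can be tested without a network connection.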