The task is to transform content into natural-sounding text that reads as if written by a professional writer who is a native English speaker. This requires identifying the content type and applying genre-appropriate writing conventions. The process involves analyzing the content thoroughly, whatever its domain, restructuring it in a style suited to its type, and adapting the writing to domain-specific conventions and audience expectations.
The initial step involves identifying and removing all images not directly related to the main article subject. Only images that show people, places, or things explicitly mentioned in the article are kept. Image descriptions, alt text, and captions are translated to English.
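As a rough illustration of the image filter described above, the sketch below keeps an image only when its description mentions one of the subjects named in the article. The `description` key, the `subjects` list, and the function name `filter_images` are all hypothetical, and the case-insensitive substring match is a crude stand-in for the judgment call the step actually requires.

```python
def filter_images(images, subjects):
    """Keep images whose description mentions at least one article subject.

    `images` is assumed to be a list of dicts with a "description" key, and
    `subjects` a list of names taken from the article; both are hypothetical.
    """
    wanted = [s.lower() for s in subjects]
    kept = []
    for image in images:
        text = image.get("description", "").lower()
        # Case-insensitive substring match: a simple proxy for "directly related".
        if any(s in text for s in wanted):
            kept.append(image)
    return kept
```

A real pipeline would probably need something more forgiving than exact substring matching, since a caption may refer to a subject by a nickname or title.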
The content is then analyzed to identify its type/genre, publication style, target audience, and expected conventions. The level of formality appropriate for the content type is noted, along with key structural elements expected in this genre and tone requirements.
Genre-appropriate rewriting is applied based on the identified content type. For news articles, the inverted pyramid structure is followed, using clear and concise language with appropriate formality. For blog content, a more conversational tone is used, with personal perspective and informal language where relevant. Technical/educational content requires a clear and logical structure with professional but accessible language.
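One way to picture the genre rules above is as a small lookup table. The sketch below is only an illustration: the keys `news`, `blog`, and `technical`, the field names `structure` and `tone`, and the function `conventions_for` are assumptions, and a value of `None` means the description gives no guidance on that point.

```python
# Conventions per genre, taken from the description above.
# Key and field names are hypothetical; None means "not specified".
GENRE_CONVENTIONS = {
    "news": {
        "structure": "inverted pyramid",
        "tone": "clear and concise, with appropriate formality",
    },
    "blog": {
        "structure": None,
        "tone": "conversational, with personal perspective and informal language where relevant",
    },
    "technical": {
        "structure": "clear and logical",
        "tone": "professional but accessible",
    },
}


def conventions_for(genre):
    """Return the conventions for a genre, or raise ValueError if it is unknown."""
    try:
        return GENRE_CONVENTIONS[genre]
    except KeyError:
        raise ValueError(f"no conventions defined for genre: {genre}") from None
```

Raising an error for an unlisted genre is a design choice made here for clarity; the description does not say how other content types should be handled.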
Regardless of content type, human elements are incorporated by varying sentence structures naturally within genre expectations, creating logical transitions, balancing detail and pacing, and including genre-appropriate context and background.
The rewritten content is then refined and checked for adherence to genre conventions and natural flow. Common English idioms and expressions are used where appropriate, and examples and references are adapted to fit local culture. The content is also checked for obvious AI writing patterns and for attribution and sourcing appropriate to the content type.
Finally, the text is validated to ensure all content has been translated to English, line breaks use \n, and the JSON format is correct with required fields. Images are reviewed again to confirm direct relevance to the main article subject, and image descriptions are translated to English. The final output is a JSON object in the required format.
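The structural part of this final check could be sketched as follows. The required fields are not listed in this description, so the sketch takes them as a parameter. It also assumes that the line-break rule means text values should contain \n and no carriage returns, which is one possible reading. The function name `validate_output` is hypothetical.

```python
import json


def validate_output(raw, required_fields):
    """Return a list of problems found in the raw JSON output (empty if none)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    # Required fields are supplied by the caller; this description does not name them.
    for field in required_fields:
        if field not in data:
            problems.append(f"missing required field: {field}")
    # Assumed reading of the line-break rule: no carriage returns in text values.
    for key, value in data.items():
        if isinstance(value, str) and "\r" in value:
            problems.append(f"field {key!r} contains a carriage return")
    return problems
```

For example, calling `validate_output(raw, ["title", "content"])` with two made-up field names returns an empty list when the output passes. Whether the text is fully in English and whether each image is relevant are not covered by this sketch and still need a separate review.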