How do I import an RSS feed as WordPress posts properly (without duplicates)?

We can import an RSS feed into WordPress as posts, or as posts of a custom post type (CPT), using WordPress and PHP functions such as wp_schedule_event(), wp_insert_post(), and simplexml_load_file(), as shown below:

add_action('import_rss_feed', 'func_import_rss_feed');

// Runs when someone visits your WordPress site and schedules the daily import once.
function func_run_schedule_event() {
    if ( ! wp_next_scheduled( 'import_rss_feed' ) ) {
        // wp_schedule_event() expects a UTC Unix timestamp, so use time().
        wp_schedule_event( time(), 'daily', 'import_rss_feed' );
    }
}
add_action( 'wp', 'func_run_schedule_event' );

function func_import_rss_feed() {
    $url = 'https://wordpress.org/?feed=rss';
    $xml = simplexml_load_file( $url );

    // Stop if the feed could not be fetched or parsed.
    if ( false === $xml || ! isset( $xml->channel->item ) ) {
        return;
    }

    foreach ( $xml->channel->item as $item ) {
        $title        = (string) $item->title;
        $link         = (string) $item->link;
        $post_content = (string) $item->description;

        // Look for a post already imported from the same link.
        // 'any' also matches drafts and private posts (but not trashed ones).
        $args = array(
            'fields'         => 'ids',
            'post_type'      => 'post',
            'post_status'    => 'any',
            'posts_per_page' => 1,
            'meta_query'     => array(
                array(
                    'key'   => '_link',
                    'value' => $link,
                ),
            ),
        );
        $pre_posts = new WP_Query( $args );

        // Insert only when no post with this link exists yet.
        if ( empty( $pre_posts->posts ) ) {
            $post_data = array(
                'post_title'   => $title,
                'post_type'    => 'post',
                'post_content' => $post_content,
                'post_status'  => 'publish',
                'meta_input'   => array(
                    '_link' => $link,
                ),
            );
            $post_id = wp_insert_post( $post_data );
        }
    }
}

If we want, we can also assign all imported posts to a specific category by passing the post_category parameter to wp_insert_post() when inserting them. See the wp_insert_post() documentation for more details.
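As a minimal sketch of the category idea, the helper below builds the array that would be handed to wp_insert_post(), adding post_category only when category IDs are supplied. The function name func_build_post_data() and the category ID 5 in the usage comment are assumptions for illustration, not part of the original code; use IDs of categories that actually exist on your site.

```php
<?php
/**
 * Build the data array for one imported feed item.
 * $category_ids is a hypothetical list of existing category IDs.
 */
function func_build_post_data( $title, $content, $link, array $category_ids = array() ) {
    $post_data = array(
        'post_title'   => $title,
        'post_type'    => 'post',
        'post_content' => $content,
        'post_status'  => 'publish',
        'meta_input'   => array(
            '_link' => $link,
        ),
    );

    if ( ! empty( $category_ids ) ) {
        // post_category expects an array of integer category IDs.
        $post_data['post_category'] = array_map( 'intval', $category_ids );
    }

    return $post_data;
}

// Inside WordPress, the result would be passed on like this:
// $post_id = wp_insert_post( func_build_post_data( $title, $post_content, $link, array( 5 ) ) );
```

Keeping the array construction in its own function makes it easy to check what will be inserted before any post is actually created.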

Let’s go through the above code step by step:

Use of wp_schedule_event() function:

// Runs when someone visits your WordPress site and schedules the daily import once.
function func_run_schedule_event() {
    if ( ! wp_next_scheduled( 'import_rss_feed' ) ) {
        // wp_schedule_event() expects a UTC Unix timestamp, so use time().
        wp_schedule_event( time(), 'daily', 'import_rss_feed' );
    }
}
add_action( 'wp', 'func_run_schedule_event' );

In func_run_schedule_event(), we first call wp_next_scheduled() to retrieve the timestamp of the next scheduled occurrence of the event. If none is found, we call wp_schedule_event(), which schedules a hook that WordPress triggers at the specified interval. wp_schedule_event() takes the parameters $timestamp, $recurrence, $hook, and $args; the first three are required and $args is optional (since WordPress 5.7 there is also an optional $wp_error parameter). For more details, see the wp_schedule_event() documentation.
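To illustrate the $recurrence parameter, here is a hedged sketch that schedules the same hook with the built-in 'hourly' interval and shows how the schedule could be removed again with wp_clear_scheduled_hook(). The function names func_schedule_hourly_import() and func_unschedule_import() are made up for this example, and the scheduler is meant to be used instead of func_run_schedule_event(), not alongside it, because an already scheduled hook is left untouched.

```php
<?php
// Schedule the import hourly, but only if nothing is scheduled for the hook yet.
function func_schedule_hourly_import() {
    if ( ! wp_next_scheduled( 'import_rss_feed' ) ) {
        // Arguments: first run (UTC Unix timestamp), recurrence name, hook name.
        wp_schedule_event( time(), 'hourly', 'import_rss_feed' );
    }
}

// Remove every scheduled occurrence of the hook, e.g. when the import is turned off.
function func_unschedule_import() {
    wp_clear_scheduled_hook( 'import_rss_feed' );
}

// Register the scheduler only when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
    add_action( 'wp', 'func_schedule_hourly_import' );
}
```

Where func_unschedule_import() gets called depends on the setup; in a plugin it would typically be wired to the deactivation hook.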

Use of simplexml_load_file() and wp_insert_post() functions:

The action below will then run on the next visit to your WordPress site after the scheduled time has passed.

add_action('import_rss_feed', 'func_import_rss_feed');

function func_import_rss_feed() {
    $url = 'https://wordpress.org/?feed=rss';
    $xml = simplexml_load_file( $url );

    // Stop if the feed could not be fetched or parsed.
    if ( false === $xml || ! isset( $xml->channel->item ) ) {
        return;
    }

    foreach ( $xml->channel->item as $item ) {
        $title        = (string) $item->title;
        $link         = (string) $item->link;
        $post_content = (string) $item->description;

        // Look for a post already imported from the same link.
        // 'any' also matches drafts and private posts (but not trashed ones).
        $args = array(
            'fields'         => 'ids',
            'post_type'      => 'post',
            'post_status'    => 'any',
            'posts_per_page' => 1,
            'meta_query'     => array(
                array(
                    'key'   => '_link',
                    'value' => $link,
                ),
            ),
        );
        $pre_posts = new WP_Query( $args );

        // Insert only when no post with this link exists yet.
        if ( empty( $pre_posts->posts ) ) {
            $post_data = array(
                'post_title'   => $title,
                'post_type'    => 'post',
                'post_content' => $post_content,
                'post_status'  => 'publish',
                'meta_input'   => array(
                    '_link' => $link,
                ),
            );
            $post_id = wp_insert_post( $post_data );
        }
    }
}

In this callback function, func_import_rss_feed(), I used two functions: simplexml_load_file() and wp_insert_post(). simplexml_load_file() fetches and parses the feed data from the remote URL, and wp_insert_post() inserts each feed item programmatically into WordPress as a blog post. Before each insert, a WP_Query lookup on the _link meta key checks whether a post with the same link was already imported, which is what prevents duplicates.
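To see what SimpleXML returns for each feed item without touching a live site, the parsing step can be tried on its own. The sketch below is an illustration under assumptions: func_parse_feed_items() is a hypothetical helper, the sample feed is made up, and simplexml_load_string() stands in for simplexml_load_file() so that no network request is needed.

```php
<?php
/**
 * Parse RSS 2.0 XML and return one array per <item>.
 * Returns an empty array if the XML cannot be parsed.
 */
function func_parse_feed_items( $xml_string ) {
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors( true );
    $xml      = simplexml_load_string( $xml_string );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    if ( false === $xml || ! isset( $xml->channel->item ) ) {
        return array();
    }

    $items = array();
    foreach ( $xml->channel->item as $item ) {
        $items[] = array(
            'title'   => trim( (string) $item->title ),
            'link'    => trim( (string) $item->link ),
            'content' => (string) $item->description,
        );
    }

    return $items;
}

// Made-up sample feed for illustration.
$sample = <<<XML
<rss version="2.0"><channel><title>Demo</title>
<item><title>First</title><link>https://example.com/1</link><description>One</description></item>
<item><title>Second</title><link>https://example.com/2</link><description>Two</description></item>
</channel></rss>
XML;

foreach ( func_parse_feed_items( $sample ) as $item ) {
    echo $item['title'] . ' => ' . $item['link'] . PHP_EOL;
}
```

Each returned array carries the same three values the import callback reads (title, link, description), so the link can be used as the duplicate check key in the same way.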
