Noureddine RAMDI Dinour

@Dinour
Dinour / EntityBase.php
Created May 12, 2020 00:46
symfony doctrine updatedAt createdAt updated_at created_at fields timestamp
<?php
namespace AppBundle\Mapping;
use Doctrine\ORM\Mapping as ORM;
use DateTime;
/**
* Class EntityBase
*

I’m looking for tips or tricks to make Chrome's headless mode less detectable. Here is what I’ve done so far:

Set my args as follows:

const run = (async () => {

    const args = [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-infobars',
// Reference: https://blog.apify.com/how-to-make-headless-chrome-and-puppeteer-use-a-proxy-server-with-authentication-249a21a79212
const puppeteer = require('puppeteer');
const proxyChain = require('proxy-chain');
const { PROXYMESH_USER, PROXYMESH_PASSWORD } = require('../config/keys');
(async () => {
const oldProxyUrl = `http://${PROXYMESH_USER}:${PROXYMESH_PASSWORD}@de.proxymesh.com:31280`;
const newProxyUrl = await proxyChain.anonymizeProxy(oldProxyUrl);
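The proxy URL above is assembled by interpolating the username and password directly into a template string. A minimal sketch follows, assuming the credentials might contain reserved characters such as `@` or `:`. The helper name `buildProxyUrl` and the placeholder credentials are hypothetical and not part of the original gist.

```javascript
// Hypothetical helper (not from the original gist): builds a proxy URL and
// percent-encodes the credentials so that characters like '@' or ':' inside
// a password cannot be mistaken for URL delimiters.
function buildProxyUrl(user, password, host, port) {
  const auth = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `http://${auth}@${host}:${port}`;
}

// Placeholder credentials, for illustration only.
const proxyUrl = buildProxyUrl('demo-user', 'p@ss:word', 'de.proxymesh.com', 31280);
console.log(proxyUrl);
```

If the credentials are plain alphanumerics, the result is identical to the simple template string, so the encoding step only matters for unusual passwords.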
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
// Emitted when the DOM is parsed and ready (without waiting for resources)
page.once('domcontentloaded', () => console.info('✅ DOM is ready'));
// Emitted when the page is fully loaded
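The snippet above registers its handler with `once`, so it runs for the first matching event only. The sketch below illustrates the difference between `once` and `on`, using Node's built-in `EventEmitter` as a stand-in for a Puppeteer page. This is an assumption for illustration: `fakePage` is hypothetical and no browser is launched.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a Puppeteer page; only the once/on semantics
// are demonstrated here.
const fakePage = new EventEmitter();
const calls = [];

// Runs for the first 'domcontentloaded' event only, then unregisters itself.
fakePage.once('domcontentloaded', () => calls.push('once'));
// Runs for every 'domcontentloaded' event.
fakePage.on('domcontentloaded', () => calls.push('on'));

// Simulate two navigations.
fakePage.emit('domcontentloaded');
fakePage.emit('domcontentloaded');

console.log(calls);
```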
Dinour / script.js
Created January 14, 2020 15:33 — forked from elog08/script.js
ScrapeInfiniteList.js
module.exports = function() {
  return new Promise((resolve, reject) => {
    // Selector for an individual thread
    const C_THREAD = '.pagedlist_item:not(.pagedlist_hidden)';
    // Selector for threads marked for deletion on the subsequent loop
    const C_THREAD_TO_REMOVE = '.pagedlist_item:not(.pagedlist_hidden) .TO_REMOVE';
    // Selector for the title
    const C_THREAD_TITLE = '.title';
    // Selector for the description
Dinour / scrape.js
Created January 14, 2020 15:33 — forked from elog08/scrape.js
Scrape with Puppeteer
const puppeteer = require('puppeteer');
const script = require('./script');
const { writeFileSync } = require('fs');

// Persist the scraped results to disk as JSON.
function save(raw) {
  writeFileSync('results.json', JSON.stringify(raw));
}
const URL = 'https://www.quora.com/search?q=meaning%20of%20life&type=answer';
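The search URL above is hard-coded with the query already percent-encoded. As a minimal sketch, assuming the query text may vary, the same string can be built from its parts. `buildSearchUrl` and `SEARCH_URL` are hypothetical names, not part of the original gist. Using a name other than `URL` also avoids shadowing Node's global `URL` class.

```javascript
// Hypothetical helper (not from the original gist): builds the search URL
// from a free-text query and a result type, percent-encoding both values.
function buildSearchUrl(query, type) {
  const q = encodeURIComponent(query);
  const t = encodeURIComponent(type);
  return `https://www.quora.com/search?q=${q}&type=${t}`;
}

const SEARCH_URL = buildSearchUrl('meaning of life', 'answer');
console.log(SEARCH_URL);
```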
Dinour / app.php
Created August 18, 2019 20:08 — forked from mosampaio/app.php
How to consume a json Rest API with PHP using Guzzle
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$client = new Client([
    'base_uri' => 'https://api.github.com',
    'timeout'  => 5.0,
]);