Ed Summers edsu

openpgp4fpr:DD11F92F1E44644183C06961D012FF557AFFF80A

#!/usr/bin/env python3
#
# This program demonstrates using the Tableau REST API to print out our embedding settings.
# To run it you will need to create a Personal Access Token by:
#
# 1. visiting https://tableau-uat.stanford.edu/
# 2. clicking on your user name in the top right
# 3. selecting "My Settings"
# 4. scrolling to the "Personal Access Tokens" section
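
Once a Personal Access Token exists, signing in to the Tableau REST API could look roughly like the sketch below. It is not the original program, whose preview is cut off above. The API version and the helper names `build_signin_request` and `sign_in` are assumptions for illustration; the credential field names follow the documented PAT sign-in payload.

```python
import json
import urllib.request

# Assumption: adjust this to the REST API version your Tableau Server supports.
API_VERSION = "3.19"


def build_signin_request(server, token_name, token_secret, site=""):
    """Build (but do not send) a PAT sign-in request for the Tableau REST API."""
    payload = {
        "credentials": {
            "personalAccessTokenName": token_name,
            "personalAccessTokenSecret": token_secret,
            # An empty contentUrl selects the default site.
            "site": {"contentUrl": site},
        }
    }
    return urllib.request.Request(
        url=f"{server.rstrip('/')}/api/{API_VERSION}/auth/signin",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def sign_in(server, token_name, token_secret, site=""):
    """Send the sign-in request and return the session token from the response."""
    request = build_signin_request(server, token_name, token_secret, site)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["credentials"]["token"]
```

The returned token would then be sent in the `X-Tableau-Auth` header on later requests. Keeping the request construction separate from the network call makes the payload easy to inspect without contacting the server.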
edsu / getall.py
Last active October 3, 2025 19:28
import dotenv
from podbucket import oai
from podbucket.oai import XML_NS
dotenv.load_dotenv()
for count, rec in enumerate(oai.list_records("503")):
    ds = rec.find(".//oai:header/oai:datestamp", namespaces=XML_NS).text
    print(ds, count)
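
The namespaced `find` in the loop above can be tried without `podbucket` or a live OAI-PMH endpoint. The sketch below uses the standard library on a hand-written sample record. It assumes that `XML_NS` maps the `oai` prefix to the OAI-PMH 2.0 namespace; the `datestamp` helper is hypothetical and also guards against a record with no datestamp.

```python
import xml.etree.ElementTree as ET

# Assumption: podbucket's XML_NS maps "oai" to the OAI-PMH 2.0 namespace.
XML_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# A hand-written sample record, not real data from any service.
SAMPLE = """
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>example:1</identifier>
    <datestamp>2024-01-01T00:00:00Z</datestamp>
  </header>
</record>
"""


def datestamp(rec):
    """Return the header datestamp of an OAI-PMH record element, or None."""
    el = rec.find(".//oai:header/oai:datestamp", namespaces=XML_NS)
    return el.text if el is not None else None


rec = ET.fromstring(SAMPLE.strip())
print(datestamp(rec))
```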
edsu / README.md
Last active October 1, 2025 21:49

If you execute ./run.sh, browsertrix-crawler will start up, crawl https://www.trm.dk/nyheder, and run a behaviour that fetches all the page results and feeds the discovered URLs to the crawl queue.

edsu / flotilla_df.py
Last active June 6, 2025 14:45
Track the progress of the Freedom Flotilla in a DataFrame https://freedomflotilla.org/ffc-tracker/
import requests
import pandas
url = "https://flotilla-orpin.vercel.app/api/vessel"
df = pandas.DataFrame.from_records(requests.get(url).json()["vessels"]["232057367"]["positions"])
df.last_position_UTC = pandas.to_datetime(df.last_position_UTC)
print(df)
docker run \
  --publish 9037:9037 \
  -v $PWD/crawls:/crawls/ \
  webrecorder/browsertrix-crawler crawl \
  --seeds https://www.womenonweb.org/af/ \
  --seeds https://www.womenonweb.org/ar/ \
  --seeds https://www.womenonweb.org/de/ \
  --seeds https://www.womenonweb.org/en/ \
  --seeds https://www.womenonweb.org/es/ \
  --seeds https://www.womenonweb.org/fa/ \
hostname           ip             provider
research.noaa.gov  3.171.38.80    Amazon Technologies Inc.
research.noaa.gov  3.171.38.79    Amazon Technologies Inc.
research.noaa.gov  3.171.38.59    Amazon Technologies Inc.
research.noaa.gov  3.171.38.3     Amazon Technologies Inc.
epic.noaa.gov      108.138.64.54  Amazon.com, Inc.
epic.noaa.gov      108.138.64.32  Amazon.com, Inc.
epic.noaa.gov      108.138.64.6   Amazon.com, Inc.
epic.noaa.gov      108.138.64.49  Amazon.com, Inc.
adp.noaa.gov       18.165.98.15   Amazon Technologies Inc.
research.noaa.gov
epic.noaa.gov
adp.noaa.gov
ci.noaa.gov
oeab.noaa.gov
orta.research.noaa.gov
testbeds.noaa.gov
qosap.research.noaa.gov
oss.research.noaa.gov
eeo.oar.noaa.gov
#!/usr/bin/env python3
import fileinput
import re
import subprocess
from functools import cache
@cache
def ips(hostname):
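
The preview of this script stops at the `ips` signature, so its actual body is unknown. One way a cached hostname lookup could work is sketched below. Shelling out to `dig +short` is an assumption, as is the `parse_ips` helper, and `ipaddress` is swapped in for the `re` module that the original imports.

```python
import ipaddress
import subprocess
from functools import cache


def parse_ips(output):
    """Return the IPv4 addresses found in line-oriented resolver output."""
    found = []
    for line in output.splitlines():
        candidate = line.strip()
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # skip blank lines and CNAME targets
        found.append(candidate)
    return found


@cache
def ips(hostname):
    """Resolve a hostname with `dig +short` (assumes dig is installed)."""
    result = subprocess.run(
        ["dig", "+short", hostname], capture_output=True, text=True, check=True
    )
    # Return a tuple so the cached value cannot be mutated by callers.
    return tuple(parse_ips(result.stdout))
```

Because of `@cache`, a hostname that appears many times in the input is only resolved once per run.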
edsu / err.log
Created March 27, 2025 13:58
Error output
Traceback (most recent call last):
  File "/Users/edsu/.pyenv/versions/3.13.0/bin/sciop", line 8, in <module>
    sys.exit(_main())
             ~~~~~^^
  File "/Users/edsu/Projects/sciop/src/sciop/cli/main.py", line 16, in _main
    main(max_content_width=100)
    ~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/edsu/.pyenv/versions/3.13.0/lib/python3.13/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ~~~~~~~~~^^^^^^^^^^^^^^^^^