@shantanuo
Created October 17, 2025 07:14
Merge new.txt with old.txt: keep the words and flags from new.txt, and include the metadata from the old file, if available.
# old.txt
about
best/SGD st:good
is
not
you
_____
# new.txt
all
about
best/SGDM
that/M
upper
very
you
_____
# final.txt (expected file after merge)
all
about
best/SGDM st:good
that/M
upper
very
you
_____
awk '
# Stage 1: read the first file (old.txt) and remember the metadata of each word.
NR == FNR {
    # Build the key by stripping everything from the first "/" or space,
    # e.g. "best/SGD st:good" -> "best". Matching the space as well keeps
    # the key clean for lines that carry metadata but no "/" flags.
    key = $0
    sub(/[\/ ].*/, "", key)
    # Find the position of the first space in the line.
    pos = index($0, " ")
    if (pos > 0) {
        # Store the text from the space onwards, leading space included.
        # For "best/SGD st:good", this stores " st:good" under the key "best".
        old_meta[key] = substr($0, pos)
    }
    next
}
# Stage 2: read the second file (new.txt) and write the merged output.
{
    # Build the key for the current line the same way.
    key = $0
    sub(/[\/ ].*/, "", key)
    # Only words that had metadata in old.txt are present in old_meta.
    if (key in old_meta) {
        # Match found: print the line from new.txt
        # and append the stored metadata from old.txt.
        print $0 old_meta[key]
    } else {
        # No match, or the old entry had no metadata:
        # print the line from new.txt as is.
        print $0
    }
}
' old.txt new.txt > final.txt
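To check the merge end to end, the sample data above can be rebuilt and run through the same two-stage logic. This is only a sketch: the scratch directory from mktemp and the printf calls that recreate old.txt and new.txt are assumptions for the demo, and the key is cut at the first "/" or space, which is also an assumption so that words without flags can still carry metadata.

```shell
#!/bin/sh
# Sketch only: rebuild the sample inputs in a scratch directory and run the merge.
set -eu
workdir=$(mktemp -d)   # hypothetical scratch location, not part of the original
cd "$workdir"

# Recreate the sample files, one entry per line.
printf '%s\n' 'about' 'best/SGD st:good' 'is' 'not' 'you' > old.txt
printf '%s\n' 'all' 'about' 'best/SGDM' 'that/M' 'upper' 'very' 'you' > new.txt

awk '
# First file: remember the metadata (text from the first space) per word.
NR == FNR {
    key = $0; sub(/[\/ ].*/, "", key)
    pos = index($0, " ")
    if (pos > 0) old_meta[key] = substr($0, pos)
    next
}
# Second file: print each line, followed by the old metadata if there is any.
{
    key = $0; sub(/[\/ ].*/, "", key)
    print $0 (key in old_meta ? old_meta[key] : "")
}
' old.txt new.txt > final.txt

cat final.txt
```

Words that exist only in old.txt ("is", "not") are dropped, because the output is driven by new.txt alone. The flags always come from new.txt ("SGDM" replaces "SGD"), while the metadata comes from old.txt.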