Can you say that again? You can say that again!
A few weeks ago I was chatting with coralina and she linked me "4:19 of The Zipf Mystery but every time he repeats a word it loops".
It's an instance of a meme format I don't think I had seen before. The basic conceit is, as the title states, every time a word that has been said before is said again, the video loops back to that time.
As I interpret it, the instances of words inside of looped sections don't count for determining the "last time" each word has been said, though it's only a little extra work to implement that interpretation as well.
As a Rakunaut, it of course didn't take me long to attempt a script that turns a bit of text into the "but every time it repeats a word it repeats the text in between" format.
After a bit of hacking, I fed the introductory paragraph of the Raku website into my script and got the following result. Each time a repetition happens, the word that caused the repetition is printed in blue, and the repeated text is printed in green:
Raku Programming language. Raku Programming language. Raku has been
developed by a team of dedicated and enthusiastic open source
developers and enthusiastic open source developers and continues to be
developed. by a team of dedicated and enthusiastic open source
developers and continues to be developed. You can help too. The Raku
Programming language. Raku has been developed by a team of dedicated
and enthusiastic open source developers and continues to be developed.
You can help too. The only requirement is Camelia. I'm the spokesbug
for the Raku Programming language. Raku has been developed by a team
of dedicated and enthusiastic open source developers and continues to
be developed. You can help too. The only requirement is that you can
help too. The only requirement is that you know how to be developed.
You can help too. The only requirement is that you know how to be
developed. You can help too. The only requirement is that you know how
to be nice to be nice to all kinds of dedicated and enthusiastic open
source developers and continues to be developed. You can help too. The
only requirement is that you know how to be nice to all kinds of
people (and continues to be developed. You can help too. The only
requirement is that you know how to be nice to all kinds of people
(and butterflies). Go to all kinds of people (and butterflies). Go to
#raku has been developed by a team of dedicated and enthusiastic open
source developers and continues to be developed. You can help too. The
only requirement is that you know how to be nice to all kinds of
people (and butterflies). Go to #raku (irc.libera.chat) and
butterflies). Go to #raku (irc.libera.chat) and someone will be nice
to all kinds of people (and butterflies). Go to #raku
(irc.libera.chat) and someone will be glad to #raku (irc.libera.chat)
and someone will be glad to help too. The only requirement is that you
know how to be nice to all kinds of people (and butterflies). Go to
#raku (irc.libera.chat) and someone will be glad to help you know how
to be nice to all kinds of people (and butterflies). Go to #raku
(irc.libera.chat) and someone will be glad to help you get started.
There are definitely some funny bits in there. My favorites include:
You can help too. The only requirement is Camelia.
You can help too. The only requirement is that you can
help too. The only requirement is that you know how to be developed.
Go to #raku (irc.libera.chat) and someone will be glad to help you know how to be nice to all kinds of people (and butterflies).
I think I might make a recording of myself reading the text and edit it to do the correct looping. Maybe I'll see if Whisper can give precise per-word timestamps that I could turn into a command line with sox or ffmpeg to create the final result.
But for now, I'll go through the actual code I used for this. You can already look at and play with the final version on Compiler Explorer here.
The version I linked to on Compiler Explorer begins with a tiny implementation of Terminal::ANSIColor's sub colored:
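sub colored($what, $how) {   # signature reconstructed; the name $how is a guess
    given $how {
        when "green" {
            "\e[31m" ~ $what ~ "\e[0m"   # (sic) 31 is the ANSI code for red
        }
        when "red" {
            "\e[32m" ~ $what ~ "\e[0m"   # (sic) 32 is the ANSI code for green
        }
    }
}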
The alternative is of course to use Terminal::ANSIColor, but Compiler Explorer doesn't have Raku libraries yet. For this case it doesn't really matter that it only supports green and red, and that trying to choose any other color just makes no text come out at all.
Oh, and to top it off, I accidentally switched the codes for green and red around in the sub, and I have the same switch-around in the code that uses the sub, so both mistakes cancel each other out here. Don't look too closely, haha.
Next is the text we want to put in. Since I was prototyping this in my code editor (vim) and executing it again after making a change, I didn't want to paste the source text in every time. For that reason, the input text is part of the source file, instead of being read from $*IN (aka stdin). It could just as easily have gone into a separate file, with my @input = "text.txt".IO.words for example.
my @input = Q[
Hi, my name is Camelia. I'm the spokesbug for the
Raku Programming language. Raku has been developed
by a team of dedicated and enthusiastic open source
developers and continues to be developed. You can
help too. The only requirement is that you know how
to be nice to all kinds of people (and butterflies).
Go to #raku (irc.libera.chat) and someone will be
glad to help you get started.
].words;
I chose the Q quoting construct with square brackets here because square brackets aren't in the source text, but using a heredoc with Q:to/INPUT-TEXT/ for example would have been just as clean.
In that case, the .words can go directly after the Q:to/INPUT-TEXT/ while the input text goes below, with indentation if you like, followed by a line with just INPUT-TEXT in it. The .words method splits on whitespace and gives us a list of runs of consecutive non-whitespace characters, so line wrapping and indentation in the input text don't end up in the output.
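For illustration, here is a minimal sketch of the heredoc variant described above, using a shortened input text. The terminator name INPUT-TEXT comes from the text; the shortened input and the two say lines are only there for demonstration.

```raku
# Heredoc variant of the input declaration (shortened text, for illustration).
# .words goes directly after the heredoc opener; the text follows below.
my @input = Q:to/INPUT-TEXT/.words;
    Hi, my name is Camelia. I'm the spokesbug for the
    Raku Programming language.
    INPUT-TEXT

# The indentation and the line break are gone; only the words remain.
say @input.elems;
say @input[0];
```

The indentation of the closing INPUT-TEXT line determines how much leading whitespace is removed from the heredoc body, though with .words that makes no difference to the result.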
Next up, we do a loop over the input array. Using the .pairs method on the array will give us a Pair object each iteration that has a .key with the index of the item and a .value of the word in question.
The result of the for loop goes directly into a result variable. For that purpose, we take the for, which by itself is a statement, and adapt it into an expression with the do prefix. That lets us put the result of every iteration directly into our array:
my @result = do for @input.pairs {   # line reconstructed; the name @result is a guess
You can see that instead of giving a variable to put the pair object into, we just use the default, which is $_, the "topic variable". This lets us refer to .key and .value just like that.
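As a small standalone sketch of this pattern (the word list and the variable names @words and @labelled are made up for the example), a do for over .pairs that uses the topic variable implicitly could look like this:

```raku
# Hypothetical mini example: collect one string per word, built from
# the index (.key) and the word (.value) of each Pair.
my @words = <you can say that again>;

my @labelled = do for @words.pairs {
    # No loop variable given, so each Pair lands in $_ and
    # .key / .value are method calls on the topic.
    .key ~ ":" ~ .value
}

say @labelled;
```

The last statement of the loop body is what each iteration contributes to the resulting list.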
Next up, inside the for loop we declare a state variable to hold information about words we've seen already. A state variable behaves like a variable you declared outside of the loop in terms of keeping values from one round to the next, but is only visible inside of the curly braces. I find that this makes it a bit clearer where the variable belongs. After the loop it is no longer relevant, and trying to address it there is just a case of "undeclared variable".
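To see the "keeps its value from one round to the next" behaviour in isolation, here is a minimal sketch with a made-up word list. The names %seen and @counts are just for this example.

```raku
# Count how often each item has shown up so far, using a state hash
# that lives across iterations but is scoped to the loop body.
my @counts = do for <a b a c a> {
    state %seen;                 # initialized once, not once per iteration
    %seen{$_}++;                 # remember that we saw this item again
    $_ ~ ":" ~ %seen{$_}         # e.g. "a:2" on the second "a"
}

say @counts;

# Referring to %seen out here would be a compile-time error,
# because the variable is only declared inside the braces.
```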
I mentioned earlier that the .words method gives us a list of runs of consecutive non-whitespace characters, and that includes punctuation. We don't want the punctuation to be counted when looking up when a word was last seen, and we also want to count capitalized and lower-cased versions of words as the same, so we normalize the words before looking them up or storing them in our %last hash:
my $keyword = .value.comb(/<alpha>/).join("").fc;
We use a simple regex with the .comb method that gives us every alphabetical character from the input, join them into one string without spaces, and turn it into fold-case (it's kind of like lower case, but different for some scripts).
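Here is what that normalization does to a few words from the input text. The sub name normalize is a hypothetical wrapper for this demonstration; the expression inside it is the one from the post.

```raku
# Hypothetical helper wrapping the normalization expression.
sub normalize(Str $word) {
    $word.comb(/<alpha>/).join("").fc
}

say normalize("(and");          # leading parenthesis dropped
say normalize("developed.");    # trailing period dropped
say normalize("The");           # case folded
say normalize("I'm");           # apostrophe dropped as well
```

One detail worth knowing: in Raku regexes, <alpha> also matches the underscore, so a word like "foo_bar" would keep its underscore. For this input text that doesn't matter.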
The next few lines set up the logic to put the index we saw the word at into the hash. Since we want to get whatever was already in the hash before we assign the new value, we have a few ways to make that happen, but the implementation I chose here is a LEAVE block, which is executed when the body of the loop has finished.
I made the choice to use LEAVE rather than just putting the code at the end of the block because I'm also using the last statement of the loop body to give the value that the for loop puts into the result list.
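A stripped-down sketch of that ordering, with a made-up three-word input: the last statement of the body still sees the old hash entry, and only afterwards does the LEAVE block store the new index. The names @report and $word are only for this example.

```raku
my %last;

my @report = do for <a b a>.pairs {
    my $word = .value;

    # Runs when the loop body is left, i.e. after the statement below
    # has produced this iteration's value.
    LEAVE {
        %last{$word} = .key;
    }

    # Last statement: builds a fresh string from the *old* entry.
    $word ~ ":" ~ (%last{$word} // "never")
}

say @report;
say %last<a>;
```

Building a new string in the last statement, rather than returning the hash element itself, keeps the example independent of any container subtleties.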
The block itself is pretty straight-forward:
LEAVE {
    %last{$keyword} = .key;
}
When leaving the for block, we set the value in %last for the $keyword to the .key, i.e. the index of the word from the input list.
We're almost done!
We now want to grab a "previous position" from the hash, if it exists, and make the repetition happen. Otherwise, the word just goes straight through to the result:
with %last{$keyword} -> $prevp {
    colored(.value, "green"),
    @input[$prevp ^.. .key].map({ colored($_, "red") })
} else -> $nothing {
    .value;
}
The with construct lets us check a value for definedness and assign it into a variable for the block.
The if statement can do the same variable assignment, but it checks for truth value. The very first word in our array would have the index 0, which would count as False, and not execute the block.
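The difference only shows up for index 0, so here is a minimal sketch with a hash that stores exactly that. The hash contents and the @log array are made up for the demonstration.

```raku
my %last = hi => 0;     # "hi" was last seen at index 0
my @log;

# if tests truthiness: 0 is false, so this block is skipped.
if %last{"hi"} -> $i {
    @log.push("if ran for index $i");
}

# with tests definedness: 0 is defined, so this block runs.
with %last{"hi"} -> $i {
    @log.push("with ran for index $i");
}

# No entry at all: undefined, so with skips the block too.
with %last{"nope"} -> $i {
    @log.push("with ran for a missing key");
}

say @log;
```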
So, when there is a previous position for our keyword, the result of the iteration should be the word itself, followed by the repeated content. $prevp is the index where the current word was seen before and .key will give us the current index. We use ^.. which creates a Range just like .., but skips the first value.
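A quick sketch of what the two range operators produce and what the resulting slice looks like, with a made-up six-word input and hard-coded indices standing in for $prevp and .key ($current is a name invented for this example):

```raku
my @input = <you can say that again you>;
my ($prevp, $current) = 0, 5;    # "you" at index 0 and again at index 5

say ($prevp ..  $current).list;  # includes the starting index
say ($prevp ^.. $current).list;  # same range without the starting index

# The slice starts right after the earlier occurrence and ends
# with the current word itself.
say @input[$prevp ^.. $current];
```

Note that the slice includes the current word at the end, which is why the repeated section in the output finishes on the word that triggered the repetition.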
We use colored for the .value as well as every word we copied out of the @input list with the [] postcircumfix operator to make the first word green and the copied words red respectively.
Now that I've looked at the code again and again for writing this post, it occurs to me that there's not really a good reason to pass every word individually through the colored sub with a map. Instead, I could have turned the list I took out of the @input array into a string joined by spaces, which is conveniently exactly what the .Str method on it would do. That can then be fed into colored and we've saved maybe a third of the whole line.
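A sketch of the two variants side by side. To keep the output readable, colored here is a stand-in that wraps the text in visible brackets instead of ANSI escape codes; that stand-in and the sample input are assumptions for the demonstration, not the code from the post.

```raku
# Stand-in for colored that produces visible markers instead of colors.
sub colored($what, $how) { "[" ~ $how ~ ": " ~ $what ~ "]" }

my @input = <you can say that again you>;
my @slice = @input[0 ^.. 5];

# Variant from the post: color every word individually.
my $per-word = @slice.map({ colored($_, "red") }).join(" ");

# Shorter variant: stringify the slice (joined by spaces), color once.
my $at-once = colored(@slice.Str, "red");

say $per-word;
say $at-once;
```

With real escape codes the two should look the same on screen apart from the spaces between words, which end up inside the colored span in the second variant.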
Ah well, what can you do! I'm not really golfing the code down to the shortest it could possibly be. It would probably look a bit different if I did.
For the case where we didn't actually have an entry in the %last hash yet for the $keyword, we would land in the else branch of this construct. We take the value we got into a named variable so that our $_ doesn't get scribbled over. We could still refer to the $_ from the outer block with $OUTER::_ but I thought that's less pleasing.
All that this block needs to do is get the .value out from the pair so it's just the word, and it's done!
Here's the whole loop in one uninterrupted piece:
my @result = do for @input.pairs {   # line reconstructed; the name @result is a guess
    state %last;
    my $keyword = .value.comb(/<alpha>/).join("").fc;
    LEAVE {
        %last{$keyword} = .key;
    }
    with %last{$keyword} -> $prevp {
        colored(.value, "green"),
        @input[$prevp ^.. .key].map({ colored($_, "red") })
    } else -> $nothing {
        .value;
    }
}
Now all that's left is to print it out to the terminal.
Just putting the text on the screen as one long string doesn't look good, so I want it word-wrapped. There is a method called naive-word-wrapper on the Str class, however it is marked is implementation-detail.
What that means is that we get no guarantees that it will stay around, or behave the same on a different version of Rakudo. It's also not expected to be present on other implementations of Raku. For this use case, I think it's totally fine. If the method is gone, we can just output the string without any wrapping of words, and maybe expect our caller to pipe it through some program that does word wrapping.
Incidentally, when trying that out, I found that neither fmt nor par understand that ANSI color formatting codes have zero visible width when printed.
Even though the naive-word-wrapper implements greedy line wrapping like fmt rather than an algorithm that tries to find a globally optimal solution for how many words should go on each line, which par has, the result still looks a lot more correct since it actually strips color formatting codes before doing its calculations.
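To make the idea concrete, here is a rough sketch of greedy wrapping that measures words by their visible width. This is not how naive-word-wrapper is implemented; the sub names visible-width and wrap-greedy, the default width of 72, and the regex (which only covers simple color codes of the form ESC [ digits m) are all assumptions for this example.

```raku
# Width of a string as it appears on screen, ignoring simple
# ANSI color sequences such as "\e[31m" and "\e[0m".
sub visible-width(Str $s) {
    $s.subst(/ \e '[' <[0..9;]>* 'm' /, '', :g).chars
}

# Greedy wrapping: keep adding words to the current line until
# the next one would push the visible width past $max.
sub wrap-greedy(@words, Int $max = 72) {
    my @lines;
    my @current;
    my $width = 0;
    for @words -> $word {
        my $w = visible-width($word);
        if @current && $width + 1 + $w > $max {
            @lines.push(@current.join(" "));
            @current = ();
            $width = 0;
        }
        $width += 1 if @current;    # the space before this word
        $width += $w;
        @current.push($word);
    }
    @lines.push(@current.join(" ")) if @current;
    return @lines;
}

my @words = "\e[31mred\e[0m", "blue", "green";
.say for wrap-greedy(@words, 8);
```

A wrapper that counted the escape codes as visible characters would break after the colored word already, since its raw length is much larger than the three characters that show up on screen.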
Again, you can copy out or play with the whole code, put in your own input text, try to make the code shorter, or whatever you like by following this link to Compiler Explorer.
Normally I'd tell you to leave a comment if you liked the post, but I haven't set up anything yet that would make that easy. Maybe soon I will have the experimental Ghost ActivityPub thing running? But until then, you can reply to this toot.
If you don't have an account that can post to the fediverse, you can also find me on IRC, on the raku mailing list, and if there's a discussion on one of the typical social media discussion sites I might see it.
I hope you'll come back when I publish my next post! Don't forget this blog has an RSS feed.