Here is why I think you should pay attention, too!
This is a classic problem in software engineering: you create a branch, you make some changes, and then you create a Git commit. In theory your commit is self-contained: it is a documentation update, or it is a bug fix, or it is a new feature, etc.
Disciplined and self-contained commits are great because the Git history becomes very readable.
It is also very easy to selectively drop a commit if something goes wrong (git revert
) or port a given change to another (maintenance) branch (git cherry-pick
).
Of course theory and practice tend to disagree, especially as we work under time-sensitive constraints, so we often end up with commits that mix several changes in one, or branches with a series of commits that should really be just one.
Another problem is that of writing proper Git commits. After all, a Git commit message is loosely defined with the first line being a title / summary of the changes, and the longer body providing more details, as in:
Fixes a race condition in the concatMap operator
Fixes concurrent signals handling leading to an inconsistent state,
especially with the termination signals of the inner and outer
subscribers.
Fixes: #666
QA-Approver: MrBean
So how are conventional commits any better than this?
The previous Git commit message was already relatively well-structured: a concise title, a body with details, and a few footers with references.
Conventional commits simply take this approach a step further by adding more structure to commit messages. Back to this example, this would give the following commit:
fix(operators): race condition in the concatMap operator
Fixes concurrent signals handling leading to an inconsistent state,
especially with the termination signals of the inner and outer
subscribers.
Issue: #666
QA-Approver: MrBean
While this might look like a cosmetic change, this message has more structure!
- fix means that the change is a bug-fix. Other common types can be feat (feature), docs (documentation updates), refactor (refactoring), etc. In fact, you can create your own conventions around it, although the Angular conventions are both widely accepted and fairly complete.
- The operators scope gives more context: the fix applies to some "operators" area of the code base. Scoping is optional, though.
The structure of a conventional commit message is as simple as:
<type>[optional scope]: <description>
[optional body]
[optional footer(s)]
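The convention also covers breaking changes: appending ! after the type or scope, or adding a BREAKING CHANGE: footer, marks a commit that calls for a major version bump. Here is a made-up example:
feat(operators)!: remove the deprecated subscribe() variants

BREAKING CHANGE: the callback-based subscribe() methods are gone.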
Note that the same ideas can be found in other approaches such as GitMoji.
Sure, emojis are fun to have in commits, but I personally find it easier to decipher that a commit is a bug fix when it starts with fix: rather than with an emoji (but my friend Philippe probably thinks otherwise).
Conventional commits can be parsed by tools, and a very nice use-case is that of generating release changelogs.
Here is a screenshot of what it looks like for the release of Mutiny 2.5.6:
I introduced JReleaser as part of the Mutiny release process when I made the project adopt conventional commits. The tool is able to group commits by kind (e.g., features, documentation updates, bug fixes, etc). It also provides a summary of the merged pull-requests, with pointers not just to the pull-requests but also to the fixed issues.
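For reference, the changelog generation is driven by the JReleaser configuration. The snippet below is only a rough sketch and not Mutiny's actual setup; the property names and the conventional-commits preset should be double-checked against the JReleaser documentation:
release:
  github:
    changelog:
      # Assumed settings: always format the changelog and group commits using the conventional-commits preset
      formatted: ALWAYS
      preset: conventional-commits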
Shall your next version be 2.6.0 or 2.5.6?
If you are maintaining a library then sticking to semantic versioning should be a no-brainer.
This is, again, a good example where the theory is nice but practice slips. This can be due to the marketing value of a version number, or just due to the fact that you have a bunch of changes and the last release was 3 months ago, so you decide to raise the minor version number.
We have all done that, but as library consumers it is quite easy to see how rigorous semantic versioning helps.
Conventional commits make it quite easy to decide what the next release number shall be. I know that some projects leave it completely to release scripts to decide on the version number by inspecting commits. I personally prefer to inspect the Git history and have the last word:
$ git log --oneline --no-merges
b9c3b93e build(deps): bump codecov/codecov-action from 3.1.4 to 3.1.5
2ec49880 chore(release): set development version to 999-SNAPSHOT
bc3ba4fd (tag: 2.5.6) chore(release): release Mutiny 2.5.6
f18296bb (origin/fix/concatmap-early-null-innerUpstream) fix(concatMap): deadlock on inner upstream subscription
796003cc fix(concatMap): check for early null inner subscriber
32fdd3e3 build(deps): bump org.assertj:assertj-core from 3.25.1 to 3.25.2
9dc8bdcd chore(release): set development version to 999-SNAPSHOT
a5fca500 (tag: 2.5.5) chore(release): release Mutiny 2.5.5
be54f155 (origin/fix/1494) fix: race condition on cancellation in UniCallbackSubscriber
4811b4b4 (origin/refactor/concatmap-no-cas-on-unbounded) refactor: avoid a compare&swap on unbounded requests
b8da91f3 chore(release): clear RevAPI breaking change justifications
c26a308f chore(release): set development version to 999-SNAPSHOT
In this short excerpt you can see that the last commits between tags did not have features (feat: xyz
), hence hinting at patch releases.
I have to admit that before adopting conventional commits I could have arbitrarily done minor rather than patch releases.
The practice of conventional commits might also help me in deciding to delay the merge of a given pull-request. If I have bug fixes and new features in the pipe then I might first have a quick patch release, then merge the new features to plan a new minor (or even major) release.
In fact I believe library maintainers should not be afraid to frequently bump the major release number. If your web browser is at version 121 then why don't you let your library be at version 12 if you can't avoid breaking changes, even low-impact ones? At least downstream consumers of your library will be aware that you take versioning seriously.
This might sound counter-intuitive, but conventional commits can be liberating! How is that possible, since each commit should be nicely self-contained?
The trick is that because you know that you eventually need to expose conventional commits in your pull-requests, you will not be tempted to make half-baked commits.
There are various ways to achieve this, but I suggest you have a look at my previous blog post on scratchpad branch workflows. The idea is pretty simple: do your exploratory work in a scratchpad branch with as many messy commits as you like, then derive a clean branch with a soft reset and craft proper conventional commits before opening the pull-request.
If your project is hosted on GitHub and uses GitHub Actions, then it is quite easy to check that a pull-request meets conventional commits.
I tested several options, and the one that worked best for me is wagoid/commitlint-github-action.
You can have a simple job in your workflow that looks like this, and it will by default use the Angular conventions:
conventional-commits:
  runs-on: ubuntu-latest
  name: Check conventional commits
  steps:
    - uses: actions/checkout@v4
    - uses: wagoid/commitlint-github-action@v5
Some people use local Git hooks to make sure that non-conforming commits do not get made in the first place, but this is too much for me.
This again applies to projects hosted on GitHub. If you are using dependabot to help you keep dependencies up-to-date, then you need to configure it so it makes conventional commits.
Simply edit your .github/dependabot.yml
file to look like:
version: 2
updates:
  - package-ecosystem: maven
    directory: "/"
    schedule:
      interval: daily
    commit-message:
      prefix: "build"
      include: "scope"
    open-pull-requests-limit: 10
  - package-ecosystem: github-actions
    directory: "/"
    schedule:
      interval: weekly
    commit-message:
      prefix: "build"
      include: "scope"
The relevant part is in the commit-message
object, which gives you commits of the form:
build(deps): bump org.assertj:assertj-core from 3.25.1 to 3.25.2
The only minor glitch, and a well-known issue, is that dependabot will generate commit message lines that can be too long for the linter to pass.
In my case I regularly have dependabot pull-requests that fail the wagoid/commitlint-github-action checks just because of long lines.
This easily happens with long Maven coordinates.
There are a couple of options; one is to provide a wagoid/commitlint-github-action configuration with relaxed custom rules (I will leave this as an exercise to the astute reader, as we said in my past professional life).
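For what it is worth, here is a rough sketch of what such a relaxed configuration could look like, using a .commitlintrc.yml file at the repository root; the rule name comes from commitlint, but the values are just an example and should be double-checked against the commitlint documentation:
extends:
  - '@commitlint/config-conventional'
rules:
  # 0 disables the rule, so dependabot bodies with long Maven coordinates pass
  body-max-line-length: [0, 'always', 100]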
I hope that this blog post will have motivated you to explore conventional commits. I don't use them in all of my projects, but I found them to be useful in the important ones that I maintain, with Mutiny being a good showcase as it is a critical component of larger projects such as Quarkus.
At first conventional commits look a bit weird, and you will repeatedly wonder what the format is as you make commits. Still, they quickly become second nature and you will realise the benefits for your software engineering processes.
At the very least they will be a useful companion when it comes to planning, crafting and performing releases. And perhaps you will finally have that clean Git history, just like in the textbooks.
It is about time for the modern Java ecosystem to migrate away from the legacy Reactive Streams APIs (the org.reactivestreams:reactive-streams Maven coordinates) and adopt the interfaces in java.util.concurrent.Flow.
I have recently started migrating the Mutiny and Mutiny Zero libraries and thought these notes would be useful to others as well.
The good news is that the legacy and the Flow
APIs are isomorphic.
For instance org.reactivestreams.Publisher<T>
becomes java.util.concurrent.Flow.Publisher<T>
.
One option is to perform string replacements to move from one API to the other, but an IDE like IntelliJ can help you with API migrations (see Refactor > Migrate Packages and Classes
):
The bad news is that moving from one API to the other could be a breaking change for your own code bases.
If your code relies on a high-level implementation of Reactive Streams then the change will be mostly transparent at the source code level.
For instance the Hibernate Reactive library uses Mutiny and none of the low-level Reactive Streams types such as Publisher
, hence the migration of Mutiny to the JDK Flow
APIs requires no change in Hibernate Reactive.
By contrast RESTEasy Reactive does support exposing endpoints using org.reactivestreams.Publisher<T>
return types (and not just, say, Multi<T>
from Mutiny), so the migration requires more work than just bumping a dependency version.
Eventually, Flow types will be supported there too.
The reactive-streams-jvm project contains adapters to go back and forth between the legacy types and the Flow types.
You might as well use the adapters that I developed and maintain as part of Mutiny Zero.
Suppose that you have a library that has yet to migrate to Flow
APIs.
You can easily turn a Publisher<T>
into a Flow.Publisher<T>
:
Publisher<String> rsPublisher = connect("foo"); // ... where 'connect' returns a Publisher<String>
Flow.Publisher<String> flowPublisher = AdaptersToFlow.publisher(rsPublisher);
Type adapters exist for the 4 interfaces of Reactive Streams, and they have virtually no cost.
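The adapters also work in the opposite direction. Assuming the Mutiny Zero flow adapters are on the classpath, the AdaptersToReactiveStreams counterpart class turns a Flow publisher back into a legacy one (a small sketch reusing flowPublisher from the snippet above):
// Going back to the legacy type, e.g., to feed an API that has not migrated yet
Publisher<String> backToLegacy = AdaptersToReactiveStreams.publisher(flowPublisher);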
While the Reactive Streams APIs are fairly simple, the devil is in the protocol and semantics. This is why publishers, processors and subscribers need to pass the Reactive Streams TCK.
There is fortunately a Flow
variant of the TCK, so if you have implemented Reactive Streams the changes will be minimal as you transition to Flow
.
First, the TCK dependency Maven coordinates will become org.reactivestreams:reactive-streams-tck-flow
.
Next, you will need to move your test classes from org.reactivestreams.tck.PublisherVerification<T>
as a base class to org.reactivestreams.tck.flow.FlowPublisherVerification<T>
.
The rest of your TCK test code will be the same, except that some method names have Flow
in them: createPublisher(long)
becomes createFlowPublisher(long)
, etc.
You can see that in one of the test cases from Mutiny Zero.
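To make this concrete, here is a minimal sketch of what a Flow TCK test class can look like; RangePublisher is a hypothetical Flow.Publisher<Long> implementation under test, and returning null from createFailedFlowPublisher skips the corresponding tests:
import java.util.concurrent.Flow;
import org.reactivestreams.tck.TestEnvironment;
import org.reactivestreams.tck.flow.FlowPublisherVerification;

public class RangePublisherTckTest extends FlowPublisherVerification<Long> {

    public RangePublisherTckTest() {
        super(new TestEnvironment());
    }

    @Override
    public Flow.Publisher<Long> createFlowPublisher(long elements) {
        // RangePublisher is a made-up publisher emitting 'elements' items
        return new RangePublisher(elements);
    }

    @Override
    public Flow.Publisher<Long> createFailedFlowPublisher() {
        // Returning null skips the tests that require a failed publisher
        return null;
    }
}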
Migrating to the JDK Flow
APIs is important for the modern Java ecosystem, especially as Reactive Streams APIs have been part of the JDK since Java 9.
The migration is fairly transparent for application developers as they are unlikely to be directly using the low-level Reactive Streams types. This is instead the duty of frameworks, libraries and drivers to do this transition and impose one less dependency in application stacks.
The migration in itself isn't too hard to perform as types are isomorphic, but there is an inevitable transition period for stacks where multiple dependencies need to be aligned past Java 8 and on top of the JDK Flow
APIs.
Type adapters represent a virtually no-cost solution when alignment is not possible yet.
The most important part for Reactive Streams implementers remains its TCK as the guardian of interoperability between various libraries.
As the TCK already ships with a Flow
variant, migrating away from the legacy APIs won't break the behavior and interoperability of Reactive Streams implementations.
Go has a great tutorial about fuzzing. The idea behind fuzzing is not to replace traditional tests but rather to complement them by (randomly) iterating over input values to the code under test. This is helpful to find bugs and security issues when the input domain is numbers, byte arrays or strings.
Using fuzzing to detect a division by zero error is most likely a bad idea if the code under test takes a parameter that is directly used to divide, as in:
func DoSomeMath(a int, b int, c int) int {
    return (a + b) / c
}
Catching that c must not be 0 is possible with fuzzing, as at some point c will be 0, but it should really be one of the first test cases you write.
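For the record, such a test is short to write since an integer division by zero panics at runtime; here is a sketch against the DoSomeMath function above:
func TestDoSomeMathRejectsZeroDivisor(t *testing.T) {
    defer func() {
        // DoSomeMath is expected to panic with an "integer divide by zero" runtime error
        if recover() == nil {
            t.Error("expected a panic on division by zero")
        }
    }()
    DoSomeMath(1, 2, 0)
}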
Now of course if your code performs a division by a number whose link with the function arguments is not so obvious then fuzzing might help.
So let's take a quite simple example: palindromes.
1221
, //--//
, madam
and eye
are valid palindromes, while foo
is not.
Let's start with a first iteration of an IsPalindrome
function:
func IsPalindrome(str string) bool {
    first := 0
    last := len(str) - 1
    for first <= last {
        if str[first] != str[last] {
            return false
        }
        first++
        last--
    }
    return true
}
I took a very simple approach to the code, with indexes at both ends of the string that converge to the middle as long as characters are identical.
Let's have a simple tabular test to cover some basic cases:
func TestIsPalindrome(t *testing.T) {
    tests := []struct {
        str  string
        want bool
    }{
        {
            str:  "eye",
            want: true,
        },
        {
            str:  "1221",
            want: true,
        },
        {
            str:  "//--//",
            want: true,
        },
        {
            str:  "foo",
            want: false,
        },
    }
    for _, tt := range tests {
        t.Run(tt.str, func(t *testing.T) {
            if got := IsPalindrome(tt.str); got != tt.want {
                t.Errorf("IsPalindrome() = %v, want %v", got, tt.want)
            }
        })
    }
}
Let's test:
$ go test
PASS
ok yolo/playground 0.239s
$
Great! How about coverage?
$ go test -coverprofile=coverage.out
PASS
coverage: 100.0% of statements
ok yolo/playground 0.170s
With 100% of statements covered, our code must be great… right?
Fuzzing works for types such as numbers, strings, byte arrays, boolean values, etc.
If your input data is some struct
then you will need to feed its fields with some fuzzed data.
The fuzzing engine can't magically generate random struct values.
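You can still fuzz a struct indirectly by fuzzing its fields and assembling the value yourself, as in this sketch (the Point type and the final check are made up for illustration):
type Point struct {
    X, Y int
    Name string
}

func FuzzPoint(f *testing.F) {
    f.Add(1, 2, "origin") // one seed value per fuzzed parameter
    f.Fuzz(func(t *testing.T, x int, y int, name string) {
        p := Point{X: x, Y: y, Name: name}
        // Exercise the code under test with p; this trivial check is just a placeholder
        if p.X != x || p.Y != y || p.Name != name {
            t.Fail()
        }
    })
}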
Here's a fuzzing test:
func FuzzIsPalindrome(f *testing.F) {
    f.Add("kayak")
    f.Fuzz(func(t *testing.T, str string) {
        t1 := IsPalindrome(str)
        t2 := reverse(str) == str
        if t1 != t2 {
            t.Fail()
        }
    })
}
The testing.F
type is for fuzzing.
The Add
function allows passing some seed data for each argument of the function given to Fuzz
.
Since we have just 1 parameter for fuzzing we just pass 1 string.
This will be the value of str
at the first iteration, then the engine will derive some other (random) strings.
Checking failures requires having some way to check results.
This can be a challenge with fuzzing since you don't know in advance what the outcome is for given input values.
In this case we use a custom reverse function that reverses a string, so it's a cheap way to check the behavior of our IsPalindrome function (more on reverse in the next section).
In other cases you might rely on the code under test to report an error or even panic.
Your mileage may vary, but it can sometimes be difficult to find a way to report when tests pass and when they fail.
So what happens when we run tests just like before?
$ go test -v
=== RUN TestIsPalindrome
=== RUN TestIsPalindrome/eye
=== RUN TestIsPalindrome/1221
=== RUN TestIsPalindrome///--//
=== RUN TestIsPalindrome/foo
--- PASS: TestIsPalindrome (0.00s)
--- PASS: TestIsPalindrome/eye (0.00s)
--- PASS: TestIsPalindrome/1221 (0.00s)
--- PASS: TestIsPalindrome///--// (0.00s)
--- PASS: TestIsPalindrome/foo (0.00s)
=== RUN FuzzIsPalindrome
=== RUN FuzzIsPalindrome/seed#0
--- PASS: FuzzIsPalindrome (0.00s)
--- PASS: FuzzIsPalindrome/seed#0 (0.00s)
PASS
ok yolo/playground 0.246s
$
We can see that the fuzz test case has been used with the seed data.
Now let's run some proper fuzzing:
$ go test -fuzz FuzzIsPalindrome
fuzz: elapsed: 0s, gathering baseline coverage: 0/8 completed
fuzz: minimizing 264-byte failing input file
fuzz: elapsed: 0s, gathering baseline coverage: 3/8 completed
--- FAIL: FuzzIsPalindrome (0.02s)
--- FAIL: FuzzIsPalindrome (0.00s)
Failing input written to testdata/fuzz/FuzzIsPalindrome/530aa3ce17341fb6fbfd1f28e61b116d8a1f20c03b796122175963d7a7863256
To re-run:
go test -run=FuzzIsPalindrome/530aa3ce17341fb6fbfd1f28e61b116d8a1f20c03b796122175963d7a7863256
FAIL
exit status 1
FAIL yolo/playground 0.270s
$
Oops! We have a bug!
We have a new file under testdata
, a folder where you can place all files useful for your package tests but that compilation will ignore.
This file tells us which input string caused the failure:
$ cat testdata/fuzz/FuzzIsPalindrome/530aa3ce17341fb6fbfd1f28e61b116d8a1f20c03b796122175963d7a7863256
go test fuzz v1
string("11\xc311")
$
In any case our regular tests now take this input data into account to catch regressions:
$ go test
--- FAIL: FuzzIsPalindrome (0.00s)
--- FAIL: FuzzIsPalindrome/530aa3ce17341fb6fbfd1f28e61b116d8a1f20c03b796122175963d7a7863256 (0.00s)
FAIL
exit status 1
FAIL yolo/playground 0.214s
$
Note that since that file catches a bug it should be under version control.
So let's go back to the failed test: the input string is not valid UTF-8.
We can fix IsPalindrome using the ValidString function from the unicode/utf8 package:
func IsPalindrome(str string) bool {
    if !utf8.ValidString(str) {
        return false
    }
    first := 0
    last := len(str) - 1
    for first <= last {
        if str[first] != str[last] {
            return false
        }
        first++
        last--
    }
    return true
}
And now we're back to green tests:
$ go test
PASS
ok yolo/playground 0.242s
$
Are we done?
Let's do some more fuzzing for 15 seconds:
$ go test -fuzz FuzzIsPalindrome -fuzztime 15s
fuzz: elapsed: 0s, gathering baseline coverage: 0/9 completed
fuzz: elapsed: 0s, gathering baseline coverage: 9/9 completed, now fuzzing with 12 workers
fuzz: elapsed: 0s, execs: 1406 (5198/sec), new interesting: 2 (total: 11)
--- FAIL: FuzzIsPalindrome (0.27s)
--- FAIL: FuzzIsPalindrome (0.00s)
Failing input written to testdata/fuzz/FuzzIsPalindrome/b102348c25c69890607f026bc3186f5faf9de089188791a75c97daf5fdd10caa
To re-run:
go test -run=FuzzIsPalindrome/b102348c25c69890607f026bc3186f5faf9de089188791a75c97daf5fdd10caa
FAIL
exit status 1
FAIL yolo/playground 0.526s
$
Another failure!
$ cat testdata/fuzz/FuzzIsPalindrome/b102348c25c69890607f026bc3186f5faf9de089188791a75c97daf5fdd10caa
go test fuzz v1
string("Γ")
$
It turns out that we should work on runes (Unicode code points) rather than accessing the string bytes by index.
A good hint is the reverse
function we use in tests and that we copy/pasted from somewhere on the Grand Internet:
func reverse(str string) string {
    r := []rune(str)
    var res []rune
    for i := len(r) - 1; i >= 0; i-- {
        res = append(res, r[i])
    }
    return string(res)
}
So let's do the same and work on runes:
func IsPalindrome(str string) bool {
    if !utf8.ValidString(str) {
        return false
    }
    r := []rune(str)
    first := 0
    last := len(r) - 1
    for first <= last {
        if r[first] != r[last] {
            return false
        }
        first++
        last--
    }
    return true
}
We are now back to green:
$ go test
PASS
ok yolo/playground 0.257s
$
And let's do some more fuzzing:
$ go test -fuzz FuzzIsPalindrome -fuzztime 15s
fuzz: elapsed: 0s, gathering baseline coverage: 0/12 completed
fuzz: elapsed: 0s, gathering baseline coverage: 12/12 completed, now fuzzing with 12 workers
fuzz: elapsed: 3s, execs: 223057 (74327/sec), new interesting: 24 (total: 36)
fuzz: elapsed: 6s, execs: 223057 (0/sec), new interesting: 24 (total: 36)
fuzz: elapsed: 9s, execs: 503462 (93490/sec), new interesting: 26 (total: 38)
fuzz: elapsed: 12s, execs: 538411 (11648/sec), new interesting: 27 (total: 39)
fuzz: elapsed: 15s, execs: 580859 (14151/sec), new interesting: 28 (total: 40)
fuzz: elapsed: 16s, execs: 580859 (0/sec), new interesting: 28 (total: 40)
PASS
ok yolo/playground 16.336s
$
No more failures, we seem to be much better now!
We just saw test fuzzing in Go.
My workflow to implement features is not very surprising: I create a feature branch from the main branch, make commits as I go, and eventually open a pull-request.
Still, there are times when I need to explore various designs, and doing so takes several days or even weeks. In such cases I would use a mix of:
- git stash to discard failed attempts but still be able to go back to them,
- git rebase to synchronize with the latest progress in the main branch, and
- git rebase -i HEAD~N (e.g., with N = 3 if I have 2 commits) to squash changes and reduce intermediate draft commits to one.
I've recently shifted to a new workflow that allows me to make exploratory branches, keep track of all intermediate steps, and finally offer a clean pull-request when ready.
1. I create a branch from the main branch, and I prefix it with scratchpad/ to signal the intent: git switch -c scratchpad/yolo
2. I commit as I go, often with just WIP as a comment: git commit -am 'WIP', git commit -am 'Adding docs', etc.
3. I create a new scratchpad/ branch any time I need to explore another design by going back to step 1.
4. I regularly rebase on top of main to capture any future conflict: git rebase origin/main
5. I push the branch to my fork: git push myfork scratchpad/yolo --set-upstream
Deriving a clean branch is easy with a soft reset:
git switch scratchpad/yolo
git switch -c feature/yolo
git reset --soft origin/main
git commit -a
Starting from here the feature/yolo
branch has a clean commit with the whole feature, while the scratchpad/yolo
branch remains visible somewhere with all the steps.
Once in a while we would get build failures in GitHub Actions runners, and of course we could not reproduce them locally. Even repeating a test a thousand times would not reproduce the failure seen in the runners. And of course, there was not much determinism in which test would fail.
Still, the logs would hint at tasks being rejected by terminated Java executors, so I started digging. I went through the usage of executors in tests, but aside from a few trivial fixes, all executors were being used as they should be, as in:
// Get an executor
var executor = Executors.newFixedThreadPool(4);
// Do stuff
doThingsWith(executor);
// Shut it down
executor.shutdownNow();
I then started tracking calls to shutdown()
and shutdownNow()
, to try and see if we had some code, somewhere, that would shut an executor down.
Nothing in tests, but I eventually found a call to shutdown
in that class from the JDK:
static class FinalizableDelegatedExecutorService
    extends DelegatedExecutorService {
    FinalizableDelegatedExecutorService(ExecutorService executor) {
        super(executor);
    }
    protected void finalize() {
        super.shutdown();
    }
}
Guess what?
This class is used by… Executors.newSingleThreadExecutor()!
public static ExecutorService newSingleThreadExecutor() {
    return new FinalizableDelegatedExecutorService
        (new ThreadPoolExecutor(1, 1,
                                0L, TimeUnit.MILLISECONDS,
                                new LinkedBlockingQueue<Runnable>()));
}
So if you create an executor using newSingleThreadExecutor()
, then the actual executor is being wrapped in a class whose sole purpose is to call shutdown()
in a finalizer.
See the first paragraph of the documentation of the (now-deprecated!) Object.finalize()
method:
Called by the garbage collector on an object when garbage collection determines that there are no more references to the object. A subclass overrides the finalize method to dispose of system resources or to perform other cleanup.
It is to be expected that in a constrained environment such as that of a CI/CD runner, the garbage collector has to run more frequently than on a laptop with 32GB of RAM. Depending on how your code is written, you may end up in cases where an executor gets finalized before it has received all of its tasks, and they get rejected.
In the case of Mutiny, test flakiness was greatly reduced by replacing calls to:
var executor = Executors.newSingleThreadExecutor();
with:
var executor = Executors.newFixedThreadPool(1);
Indeed newSingleThreadExecutor() is the sole method that wraps executors with FinalizableDelegatedExecutorService.
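You can check this for yourself by printing the implementation class names; the output below is what I would expect on JDK versions that still ship the finalizer-based wrapper:
import java.util.concurrent.Executors;

public class WhichExecutor {
    public static void main(String[] args) {
        // Typically prints java.util.concurrent.Executors$FinalizableDelegatedExecutorService
        System.out.println(Executors.newSingleThreadExecutor().getClass().getName());
        // Typically prints java.util.concurrent.ThreadPoolExecutor
        System.out.println(Executors.newFixedThreadPool(1).getClass().getName());
    }
}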
Now should you do the same to your code base and ditch newSingleThreadExecutor()? I don't think so!
- If this behavior is a problem for you then you can switch to Executors.newFixedThreadPool(1), even if there's a method with the correct name for the purpose. At some point in future Java releases finalizers will be gone and Executors.newSingleThreadExecutor() will have the same runtime behavior, but meanwhile you can avoid some potential headaches.
- If you use Executors.newSingleThreadExecutor() and you see weird task rejections then there's a good chance you hit the same problem!
This paper was co-authored with Arthur Navarro, Clément Escoffier and Frédéric Le Mouël.
Modern services running in cloud and edge environments need to be resource-efficient to increase deployment density and reduce operating costs. Asynchronous I/O combined with asynchronous programming provides a solid technical foundation to reach these goals. Reactive programming and reactive streams are gaining traction in the Java ecosystem. However, reactive streams implementations tend to be complex to work with and maintain. This paper discusses the performance of the three major reactive streams compliant libraries used in Java applications: RxJava, Project Reactor, and SmallRye Mutiny. As we will show, advanced optimization techniques such as operator fusion do not yield better performance on realistic I/O-bound workloads, and they significantly increase development and maintenance costs.
Julien Ponge, Arthur Navarro, ClΓ©ment Escoffier, and FrΓ©dΓ©ric Le MouΓ«l. 2021. Analysing the Performance and Costs of Reactive Programming Libraries in Java. In Proceedings of the 8th ACM SIGPLAN International Workshop on Reactive and Event-Based Languages and Systems (REBLS β21), October 18, 2021, Chicago, IL, USA. ACM, New York, NY, USA, 10 pages. DOI PDF
With my friends Yannick and Philippe we have decided to re-ignite the development of Eclipse Golo. We are converging towards a 3.4.0 release after 2 years of hiatus, and we are doing contributions at our own (leisure) pace.
This has been a great occasion to reconsider how releases would be published.
You can get all the source code and automation from the Eclipse Golo project on GitHub.
Golo needs to publish 2 types of release artifacts: jar artifacts that go to a Maven repository, and distribution archives that users can download.
Golo used to be released using a fairly manual process, essentially running ./gradlew publish to upload to Bintray, with my credentials for the Gradle build being safely stored in ~/.gradle/gradle.properties on my computer.
This is clearly a manual process where empowering somebody else like Yannick, who is the project co-leader, is harder than it should be.
With the new process that I recently put in place the whole deployment happens in GitHub Actions.
- Every push to the master branch triggers a deployment to Sonatype OSS. Depending on the version defined in the Gradle build file this will be a snapshots publication or a full release to Maven Central.
- Every push of a release tag (e.g., milestone/3.4.0-M4, release/3.4.0) creates a (draft) GitHub release, and the corresponding distribution archive is attached to the release for general availability consumption. The draft is manually made public after some release notes text is added.
The biggest challenge here compared to the previous process is that we need the workflow to be able to sign artifacts with a GnuPG key, and it needs to have the credentials to publish to Sonatype OSS.
Let's dive into how we publish to Maven Central from GitHub Actions using Gradle.
Publishing with Gradle to Maven Central is well-documented.
First define the following plugins:
plugins {
  // (...)
  `java-library`
  `maven-publish`
  signing
}
Next you have to create publications and define repositories so Gradle knows what files to publish, and where:
publishing {
  publications {
    create<MavenPublication>("main") {
      artifactId = "golo"
      from(components["java"])
      pom {
        name.set("Eclipse Golo Programming Language")
        description.set("Eclipse Golo: a lightweight dynamic language for the JVM.")
        url.set("https://golo-lang.org")
        inceptionYear.set("2012")
        developers {
          developer {
            name.set("Golo committers")
            email.set("golo-dev@eclipse.org")
          }
        }
        licenses {
          license {
            name.set("Eclipse Public License - v 2.0")
            url.set("https://www.eclipse.org/org/documents/epl-2.0/EPL-2.0.html")
            distribution.set("repo")
          }
        }
        scm {
          url.set("https://github.com/eclipse/golo-lang")
          connection.set("scm:git:git@github.com:eclipse/golo-lang.git")
          developerConnection.set("scm:git:ssh:git@github.com:eclipse/golo-lang.git")
        }
      }
    }
  }
  repositories {
    maven {
      name = "CameraReady"
      url = uri("$buildDir/repos/camera-ready")
    }
    maven {
      name = "SonatypeOSS"
      credentials {
        username = if (project.hasProperty("ossrhUsername")) (project.property("ossrhUsername") as String) else "N/A"
        password = if (project.hasProperty("ossrhPassword")) (project.property("ossrhPassword") as String) else "N/A"
      }
      val releasesRepoUrl = "https://oss.sonatype.org/service/local/staging/deploy/maven2/"
      val snapshotsRepoUrl = "https://oss.sonatype.org/content/repositories/snapshots/"
      url = uri(if (isReleaseVersion) releasesRepoUrl else snapshotsRepoUrl)
    }
  }
}
Here we define a publication called main
, and use some Gradle embedded domain-specific language to customise the Maven pom.xml
generation.
We also define 2 repositories: CameraReady is for checking locally what the generated publication looks like, and SonatypeOSS points to the actual Sonatype OSS repositories.
We get the Sonatype OSS credentials from the project properties ossrhUsername and ossrhPassword, but ensure we use a bogus "N/A" value so people can still build the project even if they don't have these properties defined.
We also use a boolean value isReleaseVersion
which is defined as:
val isReleaseVersion = !version.toString().endsWith("SNAPSHOT")
This allows us to point to the correct Sonatype OSS repository.
We also need to instruct Gradle to sign the publication artifacts:
signing {
  useGpgCmd()
  sign(publishing.publications["main"])
}
To check what the published artifacts would look like run:
$ ./gradlew publishAllPublicationsToCameraReadyRepository
then check the files tree:
$ exa --tree build/repos/camera-ready
build/repos/camera-ready
└── org
   └── eclipse
      └── golo
         └── golo
            ├── 3.4.0-SNAPSHOT
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar.asc
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar.asc.md5
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar.asc.sha1
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar.asc.sha256
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar.asc.sha512
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar.md5
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar.sha1
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar.sha256
            │  ├── golo-3.4.0-20201218.172135-1-javadoc.jar.sha512
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar.asc
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar.asc.md5
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar.asc.sha1
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar.asc.sha256
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar.asc.sha512
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar.md5
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar.sha1
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar.sha256
            │  ├── golo-3.4.0-20201218.172135-1-sources.jar.sha512
            │  ├── golo-3.4.0-20201218.172135-1.jar
            │  ├── golo-3.4.0-20201218.172135-1.jar.asc
            │  ├── golo-3.4.0-20201218.172135-1.jar.asc.md5
            │  ├── golo-3.4.0-20201218.172135-1.jar.asc.sha1
            │  ├── golo-3.4.0-20201218.172135-1.jar.asc.sha256
            │  ├── golo-3.4.0-20201218.172135-1.jar.asc.sha512
            │  ├── golo-3.4.0-20201218.172135-1.jar.md5
            │  ├── golo-3.4.0-20201218.172135-1.jar.sha1
            │  ├── golo-3.4.0-20201218.172135-1.jar.sha256
            │  ├── golo-3.4.0-20201218.172135-1.jar.sha512
            │  ├── golo-3.4.0-20201218.172135-1.module
            │  ├── golo-3.4.0-20201218.172135-1.module.asc
            │  ├── golo-3.4.0-20201218.172135-1.module.asc.md5
            │  ├── golo-3.4.0-20201218.172135-1.module.asc.sha1
            │  ├── golo-3.4.0-20201218.172135-1.module.asc.sha256
            │  ├── golo-3.4.0-20201218.172135-1.module.asc.sha512
            │  ├── golo-3.4.0-20201218.172135-1.module.md5
            │  ├── golo-3.4.0-20201218.172135-1.module.sha1
            │  ├── golo-3.4.0-20201218.172135-1.module.sha256
            │  ├── golo-3.4.0-20201218.172135-1.module.sha512
            │  ├── golo-3.4.0-20201218.172135-1.pom
            │  ├── golo-3.4.0-20201218.172135-1.pom.asc
            │  ├── golo-3.4.0-20201218.172135-1.pom.asc.md5
            │  ├── golo-3.4.0-20201218.172135-1.pom.asc.sha1
            │  ├── golo-3.4.0-20201218.172135-1.pom.asc.sha256
            │  ├── golo-3.4.0-20201218.172135-1.pom.asc.sha512
            │  ├── golo-3.4.0-20201218.172135-1.pom.md5
            │  ├── golo-3.4.0-20201218.172135-1.pom.sha1
            │  ├── golo-3.4.0-20201218.172135-1.pom.sha256
            │  ├── golo-3.4.0-20201218.172135-1.pom.sha512
            │  ├── maven-metadata.xml
            │  ├── maven-metadata.xml.md5
            │  ├── maven-metadata.xml.sha1
            │  ├── maven-metadata.xml.sha256
            │  └── maven-metadata.xml.sha512
            ├── maven-metadata.xml
            ├── maven-metadata.xml.md5
            ├── maven-metadata.xml.sha1
            ├── maven-metadata.xml.sha256
            └── maven-metadata.xml.sha512
The first thing is to create a GnuPG signing key:
$ gpg --gen-key
You will be asked for a name and email, choose whatever is relevant for your project. In the case of Golo the key that I created is for Eclipse Golo developers
with the email of the development mailing-list: golo-dev@eclipse.org
. Also make sure to note the passphrase for signing, we'll need it in a minute.
Maven Central checks that artifacts are being signed, and the key needs to be available from one of the popular key servers.
To do that get the fingerprint of your (public) key, then publish it:
$ gpg --fingerprint golo-dev@eclipse.org
$ gpg --keyserver http://keys.gnupg.net --send-keys FINGERPRINT
where FINGERPRINT
is… the fingerprint.
Now export the secret key to a file called golo-dev-sign.asc:
$ gpg --export-secret-key -a golo-dev@eclipse.org > golo-dev-sign.asc
This private key will be used for signing, so make sure you don't accidentally leak it. Make especially sure you don't commit it!
Gradle looks for gradle.properties
files in various places. If you have that file in your root project folder then it will be used to pass configuration to the build file.
Fill this file with relevant data:
ossrhUsername=YOUR_LOGIN
ossrhPassword=YOUR_PASSWORD
signing.gnupg.keyName=FINGERPRINT
signing.gnupg.passphrase=PASSPHRASE
where YOUR_LOGIN / YOUR_PASSWORD are from your Sonatype OSS account, and FINGERPRINT / PASSPHRASE are for the GnuPG key that you created above.
Again, be careful not to leak this file because it contains credentials!
So we have both gradle.properties
and golo-dev-sign.asc
that contain sensitive data. We want these files to be available only while the CI/CD workflow is running, so they will be stored encrypted in the Git repository.
To do that, let's define some arbitrarily complex password and store it temporarily in the GPG_SECRET
environment variable. GnuPG offers AES 256 symmetric encryption:
$ gpg --cipher-algo AES256 --symmetric --batch --yes --passphrase="${GPG_SECRET}" --output .build/golo-dev-sign.asc.gpg golo-dev-sign.asc
$ gpg --cipher-algo AES256 --symmetric --batch --yes --passphrase="${GPG_SECRET}" --output .build/gradle.properties.gpg gradle.properties
We now have .build/golo-dev-sign.asc.gpg
and .build/gradle.properties.gpg
that can be safely stored in Git. Sure anyone in the world can have these files, but without the password all they can do is a brute force attempt against AES 256 encrypted files.
To publish artifacts we need to run the Gradle publish
task. However we need Gradle to know about the credentials first, so the encrypted files have to be decrypted.
Here is the .build/deploy.sh
script that we have for that purpose:
#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'

function cleanup {
  echo "Cleanup..."
  rm -f gradle.properties golo-dev-sign.asc
}
trap cleanup SIGINT SIGTERM ERR EXIT

echo "Preparing to deploy..."
echo "Decrypting files..."

gpg --quiet --batch --yes --decrypt --passphrase="${GPG_SECRET}" \
  --output golo-dev-sign.asc .build/golo-dev-sign.asc.gpg
gpg --quiet --batch --yes --decrypt --passphrase="${GPG_SECRET}" \
  --output gradle.properties .build/gradle.properties.gpg
gpg --fast-import --no-tty --batch --yes golo-dev-sign.asc

echo "Publishing..."
./gradlew publish

echo "Done!"
This script assumes that the GPG_SECRET environment variable holds the password for the AES 256 encrypted files, and decrypts them into the project root folder.
Note that, for what it's worth, the script defines a trap to always remove the decrypted files.
Now comes the final piece of the puzzle: the workflow definition.
There are many ways one can write such workflow. In the case of Golo I opted to go with a single workflow and a single job to do everything, but do not take it as the golden solution. You may want to have separate jobs, separate workflows, etc. It all depends on your project requirements and what you want to automate.
The full workflow is available in the Golo repository on GitHub.
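Here is a simplified sketch of what such a workflow can look like, based on the description below; the Gradle tasks, action versions, artifact paths and step names are illustrative rather than Golo's exact setup:
name: Build and deploy
on:
  push:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-java@v1
        with:
          java-version: 11
      - name: Build with Gradle
        run: ./gradlew build installDist   # assumed task names
      - name: Attach the distribution archive to the run
        uses: actions/upload-artifact@v2
        with:
          name: golo-distribution
          path: build/distributions/       # assumed path
      - name: Create a draft GitHub release
        if: startsWith(github.ref, 'refs/tags/')
        uses: actions/create-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          tag_name: ${{ github.ref }}
          release_name: ${{ github.ref }}
          draft: true
          prerelease: ${{ startsWith(github.ref, 'refs/tags/milestone/') }}
      - name: Deploy to Sonatype OSS
        if: github.ref == 'refs/heads/master'
        env:
          GPG_SECRET: ${{ secrets.GPG_SECRET }}
        run: .build/deploy.sh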
The workflow only requires that you define a secret called GPG_SECRET
in your GitHub project (or organisation) settings. This secret is the golden key to everything else, since the 2 encrypted files contain your credentials for signing artifacts and uploading them to Sonatype OSS.
This workflow is linear, with many steps being conditional depending on what triggered the run.
The first steps always run: we set up Java, we check out and build the project, and we attach the distribution archive to the GitHub Actions run.
Golo uses a convention where release tags are prefixed with milestone/
and release/
. We consequently can test when a GitHub release has to be created because a tag has been pushed (if: startsWith(github.ref, 'refs/tags/')
) and when it shall be marked as a release or a pre-release (prerelease: startsWith(github.ref, 'refs/tags/milestone/')
).
Note that the GitHub release is created as a draft here because we prefer to make it live manually from the GitHub interface, but you may just directly publish it. You can also define some text / release notes using the actions/create-release
action, possibly generated from a script of yours.
The deployment step is only enabled for pushes to the master branch (if: github.ref == 'refs/heads/master') and calls the .build/deploy.sh shell script from above.
This workflow works well for a project like Golo. Again you can have a more complex workflow if that suits your needs better, or you may want to trigger workflow from other events. This is really up to you.
At the time of writing AES 256 is considered safe if you have a complex and long password.
Please keep in mind that you are still uploading your credentials to someone else's computers!
Your credentials are encrypted in a public Git repository, and they will be decrypted while the deployment script runs.
It is a very good idea to periodically update the encryption password, and rotate the passwords in the encrypted files.
The workflow above attaches a distribution of Golo to each build.
This is great because nightly builds are available as a distribution one can download from the corresponding workflow runs. Still, you don't want to hit quotas and pollute servers with everything you've built, so you can use another GitHub Actions workflow like this one for cleaning old artifacts:
# Copied from https://poweruser.blog/storage-housekeeping-on-github-actions-e2997b5b23d1
name: 'Nightly artifacts cleanup (> 14 days)'
on:
  schedule:
    - cron: '0 4 * * *' # every night at 4 am UTC
jobs:
  delete-artifacts:
    runs-on: ubuntu-latest
    steps:
      - uses: kolpav/purge-artifacts-action@v1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          expire-in: 14days
So far this workflow does not publish an updated website.
This is left for future work.
So you know ls
(often found as /bin/ls
), the good old Unix command to list files in a directory.
I recently came across exa, a modern replacement for ls
. It is part of a wave of new command-line tools written in Rust that bring modernity while staying faithful to the Unix way of writing focused and composable tools.
Of course you may wonder why switching from ls
is a good idea. It turns out that exa
is really a better ls
, with good colour support, customisable output, a humane interface and even git
metadata support (so you can see which files are being ignored, staged, etc).
The default behavior of exa
is to⦠list files, pretty much like ls
would do:
The equivalent of ls -la
is exa --long --all
:
Note that by default file sizes are given in a human-friendly form.
If you are in a Git repository you can also get metadata by adding the --git
flag to any command:
Note that reading Git metadata can slow down the execution of exa
commands, so I personally tend to use the --git
flag only when I actually need it.
You can also inspect trees with the --tree
flag:
There is also a --recurse
flag to list files in each directory of the file tree:
Typing exa
instead of ls
is one more character, and you'll likely have to fight muscle memory. In my case I am trying to get rid of typing ls -lsa.
You can easily define a few aliases so exa
becomes your new ls
. Note that exa
is not fully compatible with ls
. For instance ls -lsa
(which I am fighting) results in an error with exa -lsa
because the -s
flag requires an argument to define a sort field.
Here are my personal aliases:
# A few aliases for exa, a ls replacement
alias l="exa --sort Name"
alias ll="exa --sort Name --long"
alias la="exa --sort Name --long --all"
alias lr="exa --sort Name --long --recurse"
alias lra="exa --sort Name --long --recurse --all"
alias lt="exa --sort Name --long --tree"
alias lta="exa --sort Name --long --tree --all"
alias ls="exa --sort Name"
Feel free to take inspiration and define aliases and default flags that make sense to you!
I am happy to announce that my book Vert.x in Action (Asynchronous and Reactive Java) has been published!
I have so many people to thank that the best thing to do is to read the acknowledgements section of the book.
Writing this book has been a long and fun journey. I wrote this book with the goal of teaching concepts that will still be relevant in the years to come, and I hope that you will learn useful lessons from it!
Have fun and take care.
Keybase has been acqui-hired by Zoom. Time will tell if Keybase will remain actively developed and secure, but perhaps it will be better to just go back to plain GnuPG with its pains if you need a key pair.
I recently decided to revoke a 10-year-old GnuPG key pair that I was using across machines, and decided to start from a clean sheet. I wanted to ensure I could continue using GnuPG to sign opensource release materials, but also sign public Git commits. Until then all I used to sign were Git tags.
As I wanted to find a better solution than just using plain GnuPG and its numerous practical flaws, I gained renewed interest in Keybase, especially as it now provides more than just a streamlined experience with encryption tools.
The configuration steps are adapted from Patrick Stadler's instructions on GitHub. There is a macOS bias in some of the commands, which you can easily adapt to other Unix systems.
I have a "love - hate - hate - love - hate - hate - hate" relation with GnuPG.
This tool has a horrible user interface, and I have never really found it useful for communications. Over 15 years I have had a few GnuPG-encrypted email communications with colleagues or friends, but it has always been a hindrance. Also while it did encrypt communication content, there is enough meta-data in plain text with email (title, recipients, etc) to make GnuPG email encryption a half-baked solution to a real problem.
Still, GnuPG is useful because we may have files to encrypt so their content can only be read by ourselves and maybe a few people. We may also want to sign files for integrity checks. This is especially important with opensource development where signing source and release artifacts is a plus, and often a necessity.
Generating a key pair with GnuPG is not very difficult, and in under 2 minutes you can have one and push it to some public key servers. The problem is that once you have a key pair then no one really knows if the claimed identity is real or not. So you can go to your friends or to so-called "key signing parties" and sign other people's keys to claim that you have verified that some public key does belong to the person it claims to belong to. By doing so keys form trust networks, which helps recognize plausibly authentic keys versus fake ones.
In practice no one but a few geeks or activists will want to do that seriously. You will likely do it with a few friends and colleagues once in a while, and… that will be it. And of course people will lose their keys, and they will not even have a revocation certificate.
The Keybase service was introduced a few years back with the interesting idea of mapping social / public identities to encryption keys.
Keybase essentially introduced a modern way to build trust networks, because people can check that user123
on Keybase is also user123_coder
on GitHub, user123
on Twitter, and that the user owns the iamuser123.org
domain.
One can do so by posting Keybase-signed messages to these social networks and places.
This is a good idea, because from social networks and Keybase accounts linked to those social networks, you can build a trust network on Keybase.
Keybase also offered streamlined web and command line user interfaces for managing Keybase, following people and encrypting / decrypting content. Keybase provides a simplified workflow for common tasks including key management (e.g., fetching keys of your contacts), and it has GnuPG interoperability. You may even make the devilish (read: convenient) choice of having your keys stored on Keybase, or just attach existing GnuPG keys that you will still manage locally.
Like many people I on-boarded when the service opened and it went viral on Twitter. But then like many people I never really used it because, well, I'm not using GnuPG every day anyway.
I believe that Keybase deserves a second wave of interest, because the modern Keybase is way more interesting than just mapping identities to encryption keys.
Indeed Keybase now offers encrypted chat, encrypted file sharing (e.g., a folder for me,other is shared between 2 users), and encrypted Git repositories.
This is interesting as everything is encrypted. There are many contexts where using Keybase makes sense, such as research groups in universities. This is a context where institutions will typically provide you with bad tools and services, prevent you from using well-known tools, and where the boundaries of who you work with are quite malleable since you work with people at other institutions and companies. Here Keybase can be a secure replacement for chat, file sharing and (unpublished) source code management tools.
So how can we use Keybase and also make GnuPG friendly to other tools like Git?
On macOS with Homebrew all you need is:
brew install gnupg
brew cask install keybase
For other types of installation please refer to the Keybase website download section.
You will then want to use keybase login
to either register your machine or create a new account.
You can also use the desktop client for a friendlier experience.
You will want to claim identity proofs in various places and services: Twitter, GitHub, your website, your domain name, etc.
You can do so with keybase prove
or the desktop client.
Last but not least, you will want to follow people: keybase follow
is your friend π
Now you need Keybase to generate a GnuPG key for you:
keybase pgp gen --multi
The --multi
flag will allow you to generate a key with multiple name / email addresses.
In my case I have 2 personal email addresses and my Red Hat work email that I'm also using for opensource contributions.
Once this is done you run the following command to know the identifier of your secret key:
gpg --list-secret-keys --keyid-format LONG
And of course note the identifier for your public key, here in another format:
gpg --list-keys --keyid-format 0xshort
Various services like Maven Central will want your public key to be available from a trusty key server.
You can use gpg
to send your key to various key servers, as in:
gpg --keyserver pgp.mit.edu --send-keys IDENTIFIER_OF_YOUR_PUBLIC_KEY
You may find it equally useful to use the web interfaces of a few popular key servers to paste and upload your public key.
In that case first copy your public key to the clipboard (pbcopy
is macOS specific):
gpg --armor --export ONE_OF_YOUR_EMAIL_ADDRESS | pbcopy -
then paste it into the web forms of a few popular key servers.
Your key will quickly be synchronized between a network of public key servers.
Your signing key is your private key identifier. With that information, enable commit signing globally:
git config --global user.signingkey PRIVATE_KEY_IDENTIFIER
git config --global commit.gpgsign true
If you are on macOS you will need to install pinentry-mac
:
brew install pinentry-mac
and then edit ~/.gnupg/gpg-agent.conf
so it contains the following line:
pinentry-program /usr/local/bin/pinentry-mac
The first time you do a signed commit you will be prompted to enter your secret key passphrase, and you will be offered to save it in your macOS user keychain. If you do so then you will automatically sign commits and tags without having to worry about the passphrase.
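You can then check that signing actually kicks in by inspecting the last commit (the commit message below is just an example):
$ git commit -m "docs: fix a typo in the README"
$ git log --show-signature -1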
You can now tell your Git repository hosting services about your key, so they can show that your commits have been signed and that the signature is yours: copy your public key to the clipboard (gpg --armor --export ONE_OF_YOUR_EMAIL_ADDRESS | pbcopy -), then add it to your account settings (for instance in the GPG keys section on GitHub).
I encountered a few issues with the Gradle signing plugin. I could not make it use the GnuPG agent, and I had to let it use the default, which is to use a secret key ring file.
Edit ~/.gradle/gradle.properties
so all your projects share the same configuration.
You will need 3 signing-specific entries:
signing.keyId=0x1234
signing.password=my-secret-password
signing.secretKeyRingFile=/Users/user123/.gnupg/secring.gpg
Replace signing.keyId
with your private key identifier, signing.password
with the key password, and replace /Users/user123/
with the path to your user account.
You may also want to lock down the file permissions with chmod
so only your account can read it (remember, your passphrase is in plain text).
The secring.gpg
file may not exist if this is a first install, so run this command:
gpg --keyring secring.gpg --export-secret-keys > ~/.gnupg/secring.gpg
What happens if you have another machine to provision, be it as a replacement or as a complement?
Assuming that you created your GnuPG key from Keybase, it is stored and managed by Keybase. All you need to do is login on the new machine with your Keybase account, then:
keybase pgp list
should give your GnuPG key identifier. You can then import the public and private keys as follows:
keybase pgp export -q IDENTIFIER | gpg --import
keybase pgp export --secret -q IDENTIFIER | gpg --import --allow-secret-key-import
Encryption experts will complain, especially if you let Keybase store your private keys, but Keybase + GnuPG sounds like a nice combo for my needs.
By the way, you can find me on Keybase at https://keybase.io/jponge.
Ping me there and let me know if this was useful to you.
Thanks again to Patrick Stadler for the original instructions.
Have fun!