User.id for authenticated user id #1104
Comments
hey @heyams - user namespace is not bounded to |
What I would propose is that we introduce an additional property We could also add a Potentially we could also add in |
there was a discussion about additional sub-namespace for |
Ok with the idea of sub domains, how about we use this to track a discussion about implementing |
Previous discussions (from the client rum sig) #443 |
Ok I see valid points in that discussion how about We introduce the following:
For the device attributes how about we also introduce a session attribute? |
Sessions only last as long as the browser is open, and it's a different concept.
alternatively, we can add @trisch-me @MSNev thoughts on this? |
not a fan of to be honest i don't see an issue with tracking the user's session via a
|
I think having a separate attribute for an anonymous user id makes sense. I'd keep For an unauthenticated user, I'd recommend a separate attribute that we can try to name ( I think anonymous users are a feature that should be considered separate from user sessions because:
|
In the case of following a "user" across multiple sessions, how can we be fairly certain it is the same user? For instance, a new user at the kiosk. I propose tracking the actual device, as we are using some identifier stored on the device. The key thing for me is that the context is bound to a device. I feel that we should leave it to developers to decide what triggers the start of a new session, be it time-based or a user clicking start on a kiosk. Perhaps we look at adding a convention such as I think the approach of being able to trace all the activity coming from a device, drilling it down to a session & then taking it that step further to see the data related to an individual user. The final drill-down is to see what was done while authenticated, including who they are. |
Agree with this completely. I think having separate identifiers for the
These should be thought of as separate examples. In the kiosk use case you would likely not make your unauthenticated user id sticky. In applications where it's more safe to assume that the user is the same (an app on a mobile phone), it would be more useful to persist a longer lived value for their identifier. I would recommend a session identifier alongside an anonymous user id in both these examples. In kiosk mode, the application may decide to keep a session open for multiple customers. When in pocket mode, an application may decide to identify a potential customer for multiple sessions. |
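The kiosk and phone cases above can be sketched as one policy switch. This is a minimal illustration, not anything agreed in this thread: the `AnonymousIdentity` class, the `sticky_user` flag, and the in-memory `storage` dict (standing in for persistent device storage) are all hypothetical, and `session.id` is an assumed attribute name. `user.anonymous_id` is the name proposed in this issue.

```python
import uuid
from typing import Optional


class AnonymousIdentity:
    """Hypothetical helper that issues a session id and an anonymous user id."""

    def __init__(self, sticky_user: bool, storage: Optional[dict] = None):
        # sticky_user=True models a personal phone, False models a shared kiosk.
        self.sticky_user = sticky_user
        # Stand-in for persistent device storage (an assumption for this sketch).
        self.storage = storage if storage is not None else {}

    def start_session(self) -> dict:
        # Every session gets a fresh session id.
        session_id = uuid.uuid4().hex
        if self.sticky_user:
            # Reuse the stored anonymous id across sessions if one exists.
            if "anonymous_id" not in self.storage:
                self.storage["anonymous_id"] = uuid.uuid4().hex
            anonymous_id = self.storage["anonymous_id"]
        else:
            # Kiosk-style: never carry the anonymous id over.
            anonymous_id = uuid.uuid4().hex
        return {"session.id": session_id, "user.anonymous_id": anonymous_id}
```

With `sticky_user=True` two consecutive sessions share the anonymous id but not the session id; with `sticky_user=False` both values change every time. An application that keeps one session open for several kiosk customers would need a different trigger than the one shown here.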
So let me try and summarise the current state of where we are at, as I see it, in short form:
Open question
I am of the latter thought, especially if we can also release guidance on session tracking which includes the following examples at a minimum
|
No, the device.id is not the same as an anonymous user id, they are and need to be kept separate. The device.id is specific for the device that (one or more) users are using |
Yes I am aware that multiple unauthenticated users could use the one device. The thing which I am questioning is why we call it anonymous user id, when the user can't come back, multiple people could be involved given that we have no reliable way of knowing when the user switches and instead I propose that we refer to it as The key thing is allowing using a combination of fields depending on the use case to achieve maximum coverage and the seven scenarios described. |
As I have mentioned earlier that session only lasts as long as the browser is open, and it's a different concept. |
Yes I am aware that a session lasts only as long as the browser/app is open. What I am failing to see is how an anonymous user id can be safely reused. For me the tests should be:
Based on the above logic all the ids become complementary, each with a defined scope. Most importantly for me it enables us to see all activity coming from a device during a session, and that can be split based on the user.session, with those sessions being able to be split based on the authenticated user id |
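The device, then session, then authenticated user drill-down described above could look roughly like this on the analysis side. It is only a sketch: the `drill_down` function and the event shape are hypothetical, and `session.id` is an assumed attribute name, while `device.id` and `user.id` are the names used in this thread.

```python
from collections import defaultdict


def drill_down(events):
    """Group event names as device -> session -> user.

    The user key is None for events recorded while unauthenticated.
    """
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for event in events:
        device = event["device.id"]
        session = event["session.id"]
        user = event.get("user.id")  # missing when nobody is signed in
        tree[device][session][user].append(event["name"])
    return tree


events = [
    {"device.id": "d1", "session.id": "s1", "name": "view"},
    {"device.id": "d1", "session.id": "s1", "user.id": "u42", "name": "checkout"},
    {"device.id": "d1", "session.id": "s2", "name": "view"},
]
tree = drill_down(events)
```

Here `tree["d1"]` holds everything from one device, `tree["d1"]["s1"]` narrows to one session, and the innermost keys separate authenticated activity from unauthenticated activity.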
In the client space you can have
So when there is a single user of a computer then (if provided)
And for when multiple users are using the same (shared) hardware
So we SHOULD NOT confuse the concepts of "users" and "sessions", as in client environments they can be, and often are, different. The "user" attributes identify "who" is doing something (both anonymously and explicitly identified), while the "session" identifies "what" is occurring, so it's technically possible to identify, across a sequence of requests, how an end user is using a system and to answer questions like
|
related discussion about user.id |
After reading all the discussions I am in favor of just having |
To add my discussion here from the meeting: My concern is around the use of This attribute means: "We don't know the identity of the user, so we invented an ID to track behavior, e.g. for RUM". What this does NOT mean is "We have an anonymous identifier (removed personally identifying information)". I'd prefer to phrase this in some way that makes it clear what's happening. E.g. "user.unknown_id", "anonymous_user.id", "user.unauth_id". I believe @MSNev had a good recommendation. |
I recognize the potential for confusion with the term The following options were proposed in today's semantic conventions SIG and after the discussion with my team at Microsoft: ❤️ description: a consistent id to track a best-effort unique user regardless of the authentication state. Note: I didn't add It's ok to vote for multiple options. Please vote. |
I think there are two problems leading to confusion:
From the browser perspective, it sounds like user login should be populated in E.g. we can do:
would it be helpful if we did this instead?
Yes, it'd make it more obvious for browser-specific case, but it would make things in TL;DR: do we really need a new attribute? Can we reuse I think it's the same option as "🚀 user.auth_id", but without introducing an attribute for authenticated user id - we have It'd be great to have a md file for user in the context of browser/website that describes which user properties are applicable and how they should be populated. |
@lmolkova It's not just the browser space, it's clients in general. Generally, for the browser scenario (specifically Azure Monitor), there is the |
do we need to record login and some other id for authenticated user? i.e. why do we need |
Yes, some companies WANT to record the actual person who did the work for their internal auditing. Which is why I voted to "reclaim" |
Would it be better if it was called |
While the example I gave was "associated" with the authenticated details (object id or email), it's not necessarily (100%) the "login"; it could be anything. Just as @jsuereth doesn't like calling it anonymous, calling "some" (potentially) user-identifying id the "login" is also not correct... What should be recorded in a field called "login"? Should it be the username they entered during initialization, their associated (primary) email address (what happens when they sign in with a phone number?), or some random OTP via a secondary (multifactor) device... Or even worse, they sign in with some 3rd-party integration (Facebook, Google, Microsoft, etc.) and the app internally associates that id with an application "id" (like just a number)... So NO, I don't like using |
ok, so from the browser perspective:
The point I'm making is that adding a new attribute to this namespace will make things even more confusing. The things we need to record for browser users:
If we reuse
An alternative would be to define attributes in a new/different namespace. E.g.:
|
Proposal from @heyams where we always have |
I like this. I think there was some concern about the term
|
I think For a more concrete example of a security use case, Falco alerts can have a field |
let's separate I believe my concerns on |
@lmolkova if we agree on using What do you think about It seems we have reached a consensus via poll to have an authenticated user ID along with another attribute for a different ID: Now, it's just a matter of naming it. |
let's use I suggest semantic-conventions/docs/general/attribute-naming.md Lines 74 to 82 in aea69f2
|
I would propose using |
👋 - I was shimming into this thread while looking for standard authentication span tag conventions. What about Do you also register a I would extend the authentication namespace to this:
# String identifier to describe the authentication methods associated to the context
# Example:
# user.authentication.methods = "pwd,mfa" (ref - https://www.rfc-editor.org/rfc/rfc8176#section-2)
# client.authentication.methods = "mtls"
<identifiable>.authentication.methods = <string list>
# String identifier to describe the subject identifier (aka user_id)
# Example:
# user.authentication.identity.subject = "arn:aws:iam::123456789012:user/johndoe"
# client.authentication.identity.subject = "spiffe://example.org/ns/default/sa/default"
<identifiable>.authentication.identity.subject = <string>
# Pseudo-anonymised subject for privacy
# Example:
# user.authentication.identity.subject_hash = "5b8491046bd5db5e945654dcc60343b367f181cc642a449c150ddd42e1e4b880" # HEX(HMAC-SHA256($key, $subject))
<identifiable>.authentication.identity.subject_hash = <string>
With By the way, I would not recommend using Secondly, using an |
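For reference, the `HEX(HMAC-SHA256($key, $subject))` value mentioned for `subject_hash` can be computed with the Python standard library roughly as follows. The function name `subject_hash` and the sample key are hypothetical and only for illustration; how the key is provisioned and rotated is out of scope here.

```python
import hashlib
import hmac


def subject_hash(key: bytes, subject: str) -> str:
    """Return the hex-encoded HMAC-SHA256 of the subject under the given key."""
    return hmac.new(key, subject.encode("utf-8"), hashlib.sha256).hexdigest()


# Illustrative values only; the key below is a placeholder, not a real secret.
example = subject_hash(b"example-key", "arn:aws:iam::123456789012:user/johndoe")
```

The result is a 64-character lowercase hex string. The same key and subject always give the same value, so events can still be correlated per subject, while a different key gives unrelated values.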
I agree with |
👍 (Next Monday is a holiday in the U.S.A)
Feel free to offer feedback or discuss it in the SIG meeting. |
Hey @heyams. thanks for info. To answer your questions: Regarding |
Based on the discussion in Semconv SIG on 9/30:
Action items:
We'll have another discussion on enduser attributes naming for tracking/anonymous id and authenticated id. |
I have checked for ECS - within Elastic our usual case is actually to use user in the root level, without additional parent namespace. It also makes querying the data easier - you can just search for We do use user with parent namespace in those cases where it's ambiguous - for example Or in cases of one user (actor) performs operation on another user (target) we need to namespace the users to distinguish them. In most cases the context provides enough understanding of the type of user being referenced, making additional namespacing unnecessary. Multiple usage of the same user, or any other namespace, will be supported in I was also thinking about differences between the multiple domains, my suggestion is to make a comparison to understand where do we have differences/unclear field usage. This would help us determine if, in fact, they are not as different as we initially thought. As discussed during the meeting, it might turn out that users/instrumentation could simply skip fields that are not applicable to their specific use case. |
Area(s)
area:user
Is your change request related to a problem? Please describe.
enduser.id has been deprecated and replaced with user.id. #731
enduser.id had this old description:
user.id has this new description:
The new description is confusing now. Is it for authenticated user id or anonymous user id? What are your thoughts on creating a new attribute called user.anonymous_id? Our telemetry solution tracks both authenticated user id and anonymous user id.
Describe the solution you'd like
user.anonymous_id for the anonymous user id.
Describe alternatives you've considered
n/a
Additional context
n/a